scythe-context-mcp 0.1.4 → 0.1.6

package/.env.example CHANGED
@@ -18,4 +18,7 @@ SCYTHE_CONTEXT_MAX_CHUNKS_PER_FILE=80
 SCYTHE_CONTEXT_EMBEDDING_BATCH_SIZE=16
 SCYTHE_CONTEXT_MAX_EMBEDDING_CHUNKS=256
 
+# auto | off
+SCYTHE_CONTEXT_RERANK_MODE=auto
+
 # Legacy REPO_BEACON_* variables are still accepted as fallback during migration.
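A minimal sketch of how the new variable and its legacy alias could be resolved. The validation rule and the `REPO_BEACON_RERANK_MODE` alias come from the `dist/config.js` changes in this diff; the body of `envValue` is not shown anywhere in the diff, so the version here (taking an explicit `env` object, preferring the new name, and treating empty strings as unset) is an assumption, and `resolveRerankMode` is a hypothetical wrapper.

```javascript
// Assumed behavior: prefer the SCYTHE_CONTEXT_* name, fall back to the legacy alias.
// The real envValue in dist/config.js is not part of this diff.
function envValue(env, primaryName, legacyName) {
    const primary = env[primaryName];
    if (primary !== undefined && primary !== "") return primary;
    const legacy = env[legacyName];
    return legacy !== undefined && legacy !== "" ? legacy : undefined;
}

// Same validation as the rerankModeFromEnv function added in dist/config.js.
function rerankModeFromEnv(value) {
    if (!value) return "auto";
    if (value === "auto" || value === "off") return value;
    throw new Error("SCYTHE_CONTEXT_RERANK_MODE must be one of: auto, off");
}

// Hypothetical wrapper combining the two steps for a given environment object.
function resolveRerankMode(env) {
    return rerankModeFromEnv(
        envValue(env, "SCYTHE_CONTEXT_RERANK_MODE", "REPO_BEACON_RERANK_MODE"),
    );
}
```

Under these assumptions, an unset variable resolves to `auto`, a legacy-only setting is honored, and any value other than `auto` or `off` fails fast at config load time rather than silently disabling the reranker.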
package/CHANGELOG.md CHANGED
@@ -6,6 +6,22 @@ This project follows semantic versioning before npm publication where practical.
 
 ## [Unreleased]
 
+## [0.1.6] - 2026-06-14
+
+### Added
+
+- Add `repo_doctor` local diagnostics for runtime, native modules, Gemini env, WSL interop, and index health without external API calls.
+- Expand the context-search benchmark to cover 30 repo-self lookup cases and exclude benchmark case text from scoring.
+- Add `SCYTHE_CONTEXT_RERANK_MODE=auto|off` and benchmark `--rerank auto|off` for ranking diagnostics.
+- Add a repo-local provider capability cache so Gemini-compatible batch support and dimensionality observations are reused across probes and embedding indexing.
+
+## [0.1.5] - 2026-06-14
+
+### Added
+
+- Add a context-search benchmark comparing `rg`, keyword-only Scythe search, and Gemini-backed hybrid search.
+- Add a local code-aware reranker that uses path, snippet, symbol, import graph, file-role, and source-counterpart signals without extra model/API calls.
+
 ## [0.1.4] - 2026-06-14
 
 ### Added
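The provider capability cache from the 0.1.6 entry can be pictured with a small sketch. The diff does not show the cache's actual schema, so every field name here (`batchSupported`, `lastProbeAt`, `lastSuccessAt`, `lastFailureAt`), the per-provider key, and both function names are hypothetical; the sketch only illustrates reusing an earlier "batch is unsupported" observation instead of retrying.

```javascript
// Hypothetical record shape; the real provider-capabilities.json schema is not shown in this diff.
function recordBatchObservation(cache, providerKey, batchSupported, now = new Date()) {
    const timestamp = now.toISOString();
    const previous = cache[providerKey] ?? {};
    const outcome = batchSupported
        ? { lastSuccessAt: timestamp }
        : { lastFailureAt: timestamp };
    // Return a new object so callers can persist it atomically.
    return {
        ...cache,
        [providerKey]: { ...previous, batchSupported, lastProbeAt: timestamp, ...outcome },
    };
}

// Try the batch endpoint unless an earlier observation says it is unsupported.
function shouldTryBatch(cache, providerKey) {
    return cache[providerKey]?.batchSupported !== false;
}
```

With no observation recorded, the sketch stays optimistic and attempts a batch call; only an explicit failure flips the decision, which matches the stated goal of avoiding repeated attempts against an endpoint already known to be unsupported.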
package/README.en.md CHANGED
@@ -149,7 +149,8 @@ enabled_tools = [
   "repo_context_pack",
   "repo_semantic_search",
   "repo_related_files",
-  "gemini_embedding_probe"
+  "gemini_embedding_probe",
+  "repo_doctor"
 ]
 ```
 
@@ -157,6 +158,10 @@ enabled_tools = [
 
 If you really want to pin one default project, set `SCYTHE_CONTEXT_DEFAULT_PROJECT` under `[mcp_servers.scythe_context.env]`. Normal multi-repo usage should not need this; Scythe prefers a tool call's `project_path`, then `PWD`, then the MCP process `cwd`.
 
+`SCYTHE_CONTEXT_RERANK_MODE` can be `auto` or `off`. The default `auto` enables the local code-aware reranker; set it to `off` temporarily when diagnosing ranking behavior and comparing against the raw semantic/keyword merge.
+
+Scythe stores observed Gemini-compatible provider capabilities in repo-local `.scythe-context/provider-capabilities.json`, including whether batch embedding works, whether output dimensionality matches the expected size, and the latest probe / success / failure timestamps. This file is not committed; `repo_reindex(index_embeddings=true)` uses it to avoid repeatedly trying a batch endpoint that is already known to be unsupported.
+
 ### Gemini / v1beta proxy
 
 If URL/model/auth are not set, Scythe uses the official Gemini-compatible defaults:
@@ -243,14 +248,15 @@ Use `PWD/p` only if you intentionally run a Windows Node process and need WSL to
 | `repo_semantic_search` | Runs hybrid or semantic search over indexed chunks; useful for ranking diagnostics. |
 | `repo_related_files` | Shows symbols, imports, and importedBy for one file. |
 | `gemini_embedding_probe` | Tests Gemini or proxy compatibility and returns endpoint, latency, error classification, and remediation hints. |
+| `repo_doctor` | Checks Node runtime, native modules, Gemini env, provider capability cache, WSL interop, and index health without calling external APIs. |
 
 `repo_context_pack(mode="hybrid")` and `repo_semantic_search(mode="hybrid")` degrade to keyword-only results when query embedding is unavailable, returning `effectiveMode: "keyword"` and `fallback.reason: "embedding_unavailable"`. `mode="semantic"` does not degrade and returns `status: "embedding_unavailable"` because pure semantic search requires query embedding. Use `rg` / direct file reads for exact strings, known paths, or small targeted checks.
 
 ## Feature Status
 
-Implemented: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, and index freshness diagnostics.
+Implemented: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, keyword-only fallback when embeddings fail, local code-aware reranker, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, provider capability cache, index freshness diagnostics, and `repo_doctor`.
 
-Next: provider capability cache, install/native dependency doctor, keyword-only fallback when embeddings fail, and tree-sitter symbol extraction if needed.
+Next: expanded benchmark cases, remediation-message polish, and tree-sitter symbol extraction if needed.
 
 ## Privacy and Local Files
 
@@ -268,6 +274,7 @@ Do not include API keys, proxy tokens, private source snippets, or index databas
 - [Gemini Compatibility](docs/gemini-compatibility.md)
 - [Tech Stack](docs/tech-stack.md)
 - [Codex Integration Review](docs/codex-integration.md)
+- [Context Search Benchmark](docs/benchmark.md)
 
 ## Development and Publishing Checks
 
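The fallback behavior described in the README.en.md changes above can be sketched from a caller's point of view. The field names `effectiveMode`, `fallback.reason`, and `status` are taken from the README text; the overall response shape, the `describeSearchOutcome` helper, and its return value are assumptions made for illustration.

```javascript
// Hypothetical client-side helper; only the three response fields named in the README are relied on.
function describeSearchOutcome(response) {
    // mode="semantic" does not degrade: there are no results to use.
    if (response.status === "embedding_unavailable") {
        return {
            usable: false,
            degraded: false,
            note: "Query embedding unavailable; retry with mode=\"hybrid\" or use rg.",
        };
    }
    // mode="hybrid" degraded to keyword-only results.
    if (
        response.effectiveMode === "keyword" &&
        response.fallback?.reason === "embedding_unavailable"
    ) {
        return {
            usable: true,
            degraded: true,
            note: "Keyword-only results; semantic ranking was skipped.",
        };
    }
    return { usable: true, degraded: false, note: "Results reflect the requested mode." };
}
```

Checking `status` first keeps the non-degrading semantic case from being mistaken for an empty but valid result set.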
package/README.md CHANGED
@@ -149,7 +149,8 @@ enabled_tools = [
   "repo_context_pack",
   "repo_semantic_search",
   "repo_related_files",
-  "gemini_embedding_probe"
+  "gemini_embedding_probe",
+  "repo_doctor"
 ]
 ```
 
@@ -157,6 +158,10 @@ enabled_tools = [
 
 If you really want to pin one default project, you can set `SCYTHE_CONTEXT_DEFAULT_PROJECT` under `[mcp_servers.scythe_context.env]`. Normal multi-repo usage does not need this; Scythe prefers the tool call's `project_path`, then `PWD`, and only then the MCP process's `cwd`.
 
+`SCYTHE_CONTEXT_RERANK_MODE` can be set to `auto` or `off`. The default `auto` enables the local code-aware reranker; when troubleshooting ranking problems, temporarily set it to `off` to return to the raw ordering of the semantic/keyword merge.
+
+Scythe records its observations of the current Gemini-compatible provider's capabilities in repo-local `.scythe-context/provider-capabilities.json`, such as whether batch embedding is available, whether output dimensionality matches expectations, and the most recent probe / success / failure. This file is not committed; `repo_reindex(index_embeddings=true)` uses it to avoid repeatedly trying a batch endpoint that is known to be unsupported.
+
 ### Gemini / v1beta proxy
 
 If URL/model/auth are left blank, the official Gemini-compatible settings are used by default:
@@ -243,14 +248,15 @@ GEMINI_OUTPUT_DIMENSIONALITY = "1536"
 | `repo_semantic_search` | Runs hybrid or semantic search over indexed chunks; suited to troubleshooting ranking. |
 | `repo_related_files` | Shows the symbols, imports, and importedBy of a single file. |
 | `gemini_embedding_probe` | Tests Gemini or proxy compatibility and returns endpoint, latency, error classification, and remediation suggestions. |
+| `repo_doctor` | Checks Node runtime, native modules, Gemini env, provider capability cache, WSL interop, and index health without calling external APIs. |
 
 `repo_context_pack(mode="hybrid")` and `repo_semantic_search(mode="hybrid")` degrade to keyword-only results when query embedding is unavailable and return `effectiveMode: "keyword"` and `fallback.reason: "embedding_unavailable"`. `mode="semantic"` does not degrade and returns `status: "embedding_unavailable"`, because pure semantic search requires a query embedding. For exact strings, known paths, or small-scope checks, using `rg` or reading files directly is still recommended.
 
 ## Feature Status
 
-Done: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, index freshness diagnostics
+Done: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, keyword-only fallback when embeddings fail, local code-aware reranker, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, provider capability cache, index freshness diagnostics, `repo_doctor`.
 
-Next: provider capability cache, install/native dependency doctor, keyword-only fallback when embeddings fail, and tree-sitter symbol extraction if needed.
+Next: expanding benchmark cases, polishing error remediation hints, and tree-sitter symbol extraction if needed.
 
 ## Privacy and Local Files
 
@@ -268,6 +274,7 @@ GEMINI_OUTPUT_DIMENSIONALITY = "1536"
 - [Gemini Compatibility](docs/gemini-compatibility.md)
 - [Tech Stack](docs/tech-stack.md)
 - [Codex Integration Review](docs/codex-integration.md)
+- [Context search benchmark](docs/benchmark.md)
 
 ## Development and Publishing Checks
 
package/README.zh-CN.md CHANGED
@@ -149,7 +149,8 @@ enabled_tools = [
   "repo_context_pack",
   "repo_semantic_search",
   "repo_related_files",
-  "gemini_embedding_probe"
+  "gemini_embedding_probe",
+  "repo_doctor"
 ]
 ```
 
@@ -157,6 +158,10 @@ enabled_tools = [
 
 If you really want to pin one default project, you can set `SCYTHE_CONTEXT_DEFAULT_PROJECT` under `[mcp_servers.scythe_context.env]`. Normal multi-repo usage does not need this; Scythe prefers the tool call's `project_path`, then `PWD`, and only then the MCP process's `cwd`.
 
+`SCYTHE_CONTEXT_RERANK_MODE` can be set to `auto` or `off`. The default `auto` enables the local code-aware reranker; when troubleshooting ranking problems, temporarily set it to `off` to return to the raw ordering of the semantic/keyword merge.
+
+Scythe records its observations of the current Gemini-compatible provider's capabilities in repo-local `.scythe-context/provider-capabilities.json`, such as whether batch embedding is available, whether output dimensionality matches expectations, and the most recent probe / success / failure. This file is not committed; `repo_reindex(index_embeddings=true)` uses it to avoid repeatedly trying a batch endpoint that is known to be unsupported.
+
 ### Gemini / v1beta proxy
 
 If URL/model/auth are left blank, the official Gemini-compatible configuration is used by default:
@@ -243,14 +248,15 @@ GEMINI_OUTPUT_DIMENSIONALITY = "1536"
 | `repo_semantic_search` | Runs hybrid or semantic search over indexed chunks; suited to troubleshooting ranking. |
 | `repo_related_files` | Shows the symbols, imports, and importedBy of a single file. |
 | `gemini_embedding_probe` | Tests Gemini or proxy compatibility and returns endpoint, latency, error classification, and remediation suggestions. |
+| `repo_doctor` | Checks Node runtime, native modules, Gemini env, provider capability cache, WSL interop, and index health without calling external APIs. |
 
 `repo_context_pack(mode="hybrid")` and `repo_semantic_search(mode="hybrid")` degrade to keyword-only results when query embedding is unavailable and return `effectiveMode: "keyword"` and `fallback.reason: "embedding_unavailable"`. `mode="semantic"` does not degrade and returns `status: "embedding_unavailable"`, because pure semantic search requires a query embedding. For exact strings, known paths, or small-scope checks, using `rg` or reading files directly is still recommended.
 
 ## Feature Status
 
-Done: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, index freshness diagnostics
+Done: repo scanning, chunking, SQLite metadata, SQLite FTS5, sqlite-vec, Gemini Embedding 2 provider, semantic/keyword/hybrid search, keyword-only fallback when embeddings fail, local code-aware reranker, lightweight symbol/dependency graph, related-file lookup, `repo_context_pack`, provider diagnostics, provider capability cache, index freshness diagnostics, `repo_doctor`.
 
-Next: provider capability cache, install/native dependency doctor, keyword-only fallback when embeddings fail, and tree-sitter symbol extraction if needed.
+Next: expanding benchmark cases, polishing error remediation hints, and tree-sitter symbol extraction if needed.
 
 ## Privacy and Local Files
 
@@ -268,6 +274,7 @@ GEMINI_OUTPUT_DIMENSIONALITY = "1536"
 - [Gemini Compatibility](docs/gemini-compatibility.md)
 - [Tech Stack](docs/tech-stack.md)
 - [Codex Integration Review](docs/codex-integration.md)
+- [Context search benchmark](docs/benchmark.md)
 
 ## Development and Publishing Checks
 
@@ -0,0 +1,291 @@
+[
+  {
+    "id": "embedding-fallback",
+    "query": "embedding unavailable should fall back to keyword-only context pack",
+    "expectedPaths": [
+      "src/tools/registerTools.ts",
+      "src/indexing/hybridSearch.ts"
+    ],
+    "notes": "Task-style query for the hybrid-to-keyword fallback path."
+  },
+  {
+    "id": "stable-chunk-ids",
+    "query": "preserve stable chunk row ids so embedding cache remains useful after reindex",
+    "expectedPaths": [
+      "src/indexing/indexWriter.ts",
+      "src/storage/schema.ts",
+      "src/storage/sqliteVec.test.ts"
+    ],
+    "notes": "Looks for storage and reindex behavior tied to embedding reuse."
+  },
+  {
+    "id": "utf8-binary-detection",
+    "query": "UTF-8 scanner should not treat a file as binary when prefix ends in multibyte character",
+    "expectedPaths": [
+      "src/indexing/binary.ts",
+      "src/indexing/scanner.test.ts"
+    ],
+    "notes": "Regression coverage for text detection around multibyte boundaries."
+  },
+  {
+    "id": "codex-wsl-config",
+    "query": "Codex App WSL Windows node setup with PWD and WSLENV",
+    "expectedPaths": [
+      "README.md",
+      "docs/codex-integration.md",
+      "src/config.ts"
+    ],
+    "notes": "Documentation and config lookup for the Windows Node plus WSL workspace mode."
+  },
+  {
+    "id": "related-file-graph",
+    "query": "related files imports reverse imports graph for context pack",
+    "expectedPaths": [
+      "src/indexing/relatedFiles.ts",
+      "src/indexing/contextPack.ts"
+    ],
+    "notes": "Finds symbol and dependency graph code used by repo_context_pack."
+  },
+  {
+    "id": "gemini-proxy-url",
+    "query": "Gemini v1beta compatible proxy base URL bearer auth output dimensionality",
+    "expectedPaths": [
+      "src/providers/gemini.ts",
+      "src/config.ts",
+      "docs/gemini-compatibility.md"
+    ],
+    "notes": "Provider compatibility query with base URL and auth details."
+  },
+  {
+    "id": "npm-bin-mode",
+    "query": "npm package executable bin mode should be checked before publish",
+    "expectedPaths": [
+      "scripts/bin-mode.mjs",
+      "package.json",
+      "src/cli.ts"
+    ],
+    "notes": "Release packaging and executable bit smoke coverage."
+  },
+  {
+    "id": "fts-keyword-search",
+    "query": "FTS trigram keyword search ranks chunks by bm25 and file path fallback",
+    "expectedPaths": [
+      "src/indexing/keywordSearch.ts",
+      "src/storage/schema.ts"
+    ],
+    "notes": "Keyword search internals and schema lookup."
+  },
+  {
+    "id": "doctor-native-modules",
+    "query": "diagnose better-sqlite3 sqlite-vec native module load failure without external API calls",
+    "expectedPaths": [
+      "src/tools/doctor.ts",
+      "src/storage/sqliteVec.ts",
+      "src/tools/doctor.test.ts"
+    ],
+    "notes": "Install/runtime diagnostics for native dependencies."
+  },
+  {
+    "id": "doctor-embedding-coverage",
+    "query": "repo doctor should warn when embedding coverage is incomplete",
+    "expectedPaths": [
+      "src/tools/doctor.ts",
+      "src/tools/doctor.test.ts"
+    ],
+    "notes": "Doctor index health and embedding coverage warning behavior."
+  },
+  {
+    "id": "register-repo-doctor-tool",
+    "query": "where is repo_doctor registered as an MCP tool",
+    "expectedPaths": [
+      "src/tools/registerTools.ts",
+      "src/tools/doctor.ts"
+    ],
+    "notes": "Exact tool wiring lookup."
+  },
+  {
+    "id": "server-instructions-rg-vs-context",
+    "query": "server instructions tell Codex when to use Scythe Context and when rg is better",
+    "expectedPaths": [
+      "src/index.ts",
+      "README.md",
+      "docs/codex-integration.md"
+    ],
+    "notes": "Prompt/tool-use policy lookup."
+  },
+  {
+    "id": "cli-version-help",
+    "query": "CLI --version and --help should match package version and show MCP config",
+    "expectedPaths": [
+      "src/cli.ts",
+      "src/cli.test.ts",
+      "package.json"
+    ],
+    "notes": "CLI smoke and version consistency."
+  },
+  {
+    "id": "config-default-project-pwd",
+    "query": "default project path should prefer SCYTHE_CONTEXT_DEFAULT_PROJECT then PWD then cwd",
+    "expectedPaths": [
+      "src/config.ts",
+      "src/config.test.ts"
+    ],
+    "notes": "Multi-repo default project behavior."
+  },
+  {
+    "id": "legacy-env-aliases",
+    "query": "legacy REPO_BEACON environment variables should still work as fallback aliases",
+    "expectedPaths": [
+      "src/config.ts",
+      "src/config.test.ts",
+      "README.md"
+    ],
+    "notes": "Rename compatibility and config migration."
+  },
+  {
+    "id": "gemini-batch-fallback",
+    "query": "embedding writer should fall back from batch embedding to single embed requests",
+    "expectedPaths": [
+      "src/indexing/embeddingWriter.ts",
+      "src/indexing/embeddingWriter.test.ts"
+    ],
+    "notes": "Provider capability and batch compatibility behavior."
+  },
+  {
+    "id": "gemini-error-classification",
+    "query": "Gemini provider should classify retryable HTTP errors and include safe body snippets",
+    "expectedPaths": [
+      "src/providers/gemini.ts",
+      "src/providers/gemini.test.ts"
+    ],
+    "notes": "Provider error handling lookup."
+  },
+  {
+    "id": "context-budgeting-truncation",
+    "query": "context pack should respect max context chars and mark truncated snippets",
+    "expectedPaths": [
+      "src/indexing/contextPack.ts",
+      "src/indexing/resultFormat.ts",
+      "src/indexing/contextPack.test.ts",
+      "src/indexing/resultFormat.test.ts"
+    ],
+    "notes": "Context budget and formatting behavior."
+  },
+  {
+    "id": "related-snippet-packing",
+    "query": "include related snippets for non-primary related files in context pack",
+    "expectedPaths": [
+      "src/indexing/relatedSnippets.ts",
+      "src/tools/registerTools.ts",
+      "src/indexing/contextPack.test.ts"
+    ],
+    "notes": "Related context expansion query."
+  },
+  {
+    "id": "code-aware-reranker-source-counterpart",
+    "query": "local code-aware reranker should add source counterpart when matched test file contains query terms",
+    "expectedPaths": [
+      "src/indexing/codeAwareReranker.ts",
+      "src/indexing/hybridSearch.test.ts",
+      "src/indexing/hybridSearch.ts"
+    ],
+    "notes": "Reranker source/test relation behavior."
+  },
+  {
+    "id": "path-role-ranking",
+    "query": "ranking should boost source files and downrank docs tests generated files depending on query intent",
+    "expectedPaths": [
+      "src/indexing/codeAwareReranker.ts",
+      "src/indexing/relatedFiles.ts"
+    ],
+    "notes": "File role and intent based ranking."
+  },
+  {
+    "id": "scanner-ignore-local-secrets",
+    "query": "scanner should ignore local secrets references screenshots and index directories",
+    "expectedPaths": [
+      "src/indexing/scanner.ts",
+      ".gitignore",
+      "AGENTS.md"
+    ],
+    "notes": "Privacy and ignored files behavior."
+  },
+  {
+    "id": "dry-run-indexing-stats",
+    "query": "dry run should scan project and report skipped files chunks bytes without writing metadata",
+    "expectedPaths": [
+      "src/indexing/dryRun.ts",
+      "src/indexing/dryRun.test.ts"
+    ],
+    "notes": "Safe indexing preview lookup."
+  },
+  {
+    "id": "sqlite-schema-fts-vector",
+    "query": "SQLite schema creates chunk FTS virtual table and sqlite-vec vector table",
+    "expectedPaths": [
+      "src/storage/schema.ts",
+      "src/storage/schema.test.ts",
+      "src/storage/sqliteVec.ts"
+    ],
+    "notes": "Storage schema internals."
+  },
+  {
+    "id": "dependency-resolution-relative-imports",
+    "query": "resolve relative imports to active project paths for TypeScript Python Rust and Go",
+    "expectedPaths": [
+      "src/indexing/symbolGraph.ts",
+      "src/indexing/symbolGraph.test.ts"
+    ],
+    "notes": "Symbol/dependency graph extraction."
+  },
+  {
+    "id": "semantic-vector-search-sqlite-vec",
+    "query": "semantic search uses sqlite-vec table and query vector dimensions",
+    "expectedPaths": [
+      "src/indexing/semanticSearch.ts",
+      "src/indexing/semanticSearch.test.ts",
+      "src/storage/sqliteVec.ts"
+    ],
+    "notes": "Raw semantic vector search behavior."
+  },
+  {
+    "id": "benchmark-rg-baseline",
+    "query": "benchmark compares rg smart baseline against scythe keyword and hybrid search",
+    "expectedPaths": [
+      "scripts/context-benchmark.mjs",
+      "benchmarks/context-search-cases.json",
+      "docs/benchmark.md"
+    ],
+    "notes": "Benchmark harness and no-MCP baseline."
+  },
+  {
+    "id": "npm-package-files",
+    "query": "npm package should include dist docs benchmark cases and context benchmark script but not local secrets",
+    "expectedPaths": [
+      "package.json",
+      ".gitignore",
+      "scripts/context-benchmark.mjs"
+    ],
+    "notes": "Package files and privacy boundary."
+  },
+  {
+    "id": "github-ci-release-checks",
+    "query": "CI should run tests build smoke audit before release",
+    "expectedPaths": [
+      ".github/workflows/ci.yml",
+      "package.json",
+      "CONTRIBUTING.md"
+    ],
+    "notes": "Contributor and CI workflow lookup."
+  },
+  {
+    "id": "security-privacy-embedding-text",
+    "query": "security policy should explain embedding text goes to configured remote provider and local indexes are not committed",
+    "expectedPaths": [
+      "SECURITY.md",
+      "README.md",
+      "AGENTS.md"
+    ],
+    "notes": "Privacy docs and remote embedding risk."
+  }
+]
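To make the role of `expectedPaths` concrete, here is one way a case from the file above could be scored against a ranked result list. This is an illustrative sketch only: the actual metrics used by `scripts/context-benchmark.mjs` are not shown in this diff, so the reciprocal-rank and recall-at-k choices, the `scoreCase` name, the default `k`, and the `excludedPaths` parameter are all assumptions. Dropping the cases file from the results before scoring is one reading of the changelog note about excluding benchmark case text.

```javascript
// Hypothetical scorer; metric choices are assumptions, not the benchmark's documented method.
function scoreCase(benchmarkCase, rankedPaths, options = {}) {
    const k = options.k ?? 5;
    // The cases file contains every query verbatim, so it is dropped before scoring.
    const excluded = new Set(options.excludedPaths ?? ["benchmarks/context-search-cases.json"]);
    const ranked = rankedPaths.filter((path) => !excluded.has(path));

    const expected = new Set(benchmarkCase.expectedPaths);
    const firstHit = ranked.findIndex((path) => expected.has(path));
    const topK = new Set(ranked.slice(0, k));
    const found = benchmarkCase.expectedPaths.filter((path) => topK.has(path));

    return {
        id: benchmarkCase.id,
        // 1 / rank of the first expected path, or 0 when none was returned.
        reciprocalRank: firstHit === -1 ? 0 : 1 / (firstHit + 1),
        // Share of expected paths that appear in the top k results.
        recallAtK: found.length / benchmarkCase.expectedPaths.length,
    };
}
```

Averaging `reciprocalRank` and `recallAtK` over all 30 cases would give one comparable number per search mode (`rg`, keyword-only, hybrid), which is the kind of comparison the changelog describes.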
package/dist/cli.js CHANGED
@@ -1,4 +1,4 @@
-export const PACKAGE_VERSION = "0.1.4";
+export const PACKAGE_VERSION = "0.1.6";
 export function parseCliArgs(args) {
     if (args.length === 0)
         return { kind: "serve" };
@@ -31,5 +31,6 @@ Environment:
 GEMINI_API_KEY Gemini or Gemini-compatible API key
 GEMINI_BASE_URL Gemini-compatible base URL, default https://generativelanguage.googleapis.com/v1beta
 GEMINI_OUTPUT_DIMENSIONALITY Embedding dimensions, default 1536
+SCYTHE_CONTEXT_RERANK_MODE auto or off, default auto
 `;
 }
package/dist/config.js CHANGED
@@ -33,6 +33,13 @@ function authModeFromEnv(value) {
     }
     throw new Error("GEMINI_AUTH_MODE must be one of: x-goog-api-key, bearer, query");
 }
+function rerankModeFromEnv(value) {
+    if (!value)
+        return "auto";
+    if (value === "auto" || value === "off")
+        return value;
+    throw new Error("SCYTHE_CONTEXT_RERANK_MODE must be one of: auto, off");
+}
 function defaultProjectPathFromEnv() {
     const explicitProject = envValue("SCYTHE_CONTEXT_DEFAULT_PROJECT", "REPO_BEACON_DEFAULT_PROJECT");
     if (explicitProject)
@@ -61,6 +68,9 @@ export function loadConfig() {
             embeddingBatchSize: numberFromEnvAlias("SCYTHE_CONTEXT_EMBEDDING_BATCH_SIZE", "REPO_BEACON_EMBEDDING_BATCH_SIZE", 16) ?? 16,
             maxEmbeddingChunks: numberFromEnvAlias("SCYTHE_CONTEXT_MAX_EMBEDDING_CHUNKS", "REPO_BEACON_MAX_EMBEDDING_CHUNKS", 256) ?? 256,
         },
+        search: {
+            rerankMode: rerankModeFromEnv(envValue("SCYTHE_CONTEXT_RERANK_MODE", "REPO_BEACON_RERANK_MODE")),
+        },
         gemini: {
             apiKey: process.env.GEMINI_API_KEY,
             baseUrl: process.env.GEMINI_BASE_URL || "https://generativelanguage.googleapis.com/v1beta",