ownsearch 0.1.4 → 0.1.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +59 -5
- package/dist/{chunk-ZQAY3FE3.js → chunk-TBXFY4OJ.js} +232 -2
- package/dist/cli.js +224 -39
- package/dist/mcp/server.js +305 -21
- package/package.json +1 -1
- package/skills/ownsearch-rag-search/SKILL.md +30 -3
package/README.md
CHANGED
````diff
@@ -25,6 +25,12 @@ Most agents waste time and tokens when they do one of two things:
 - giving agents a structured way to fetch only the chunks they need
 - improving answer quality with reranking, deduplication, and grounded chunk access
 
+It uses a hybrid retrieval surface rather than treating embeddings as a full replacement for exact search:
+
+- `literal_search` for exact names, titles, IDs, and quoted phrases
+- `search_context` for the normal fast semantic path
+- `deep_search_context` for archive-style, multi-document, or ambiguity-heavy questions
+
 ## Core use cases
 
 `ownsearch` is a good fit when an agent needs to work over:
````
````diff
@@ -66,6 +72,7 @@ What is already strong in the current package:
 - support for common text document formats
 - large plain text and code files are no longer blocked by the extracted-document size cap
 - repeatable smoke validation for mixed text corpora
+- a hybrid retrieval interface that works better for agents than embeddings alone
 
 ## V1 supported document types
 
````
````diff
@@ -93,6 +100,8 @@ Installation:
 npm install -g ownsearch
 ```
 
+Gemini API usage is governed by Google’s current free-tier limits, quotas, and pricing.
+
 Deployment checklist:
 
 ```bash
````
````diff
@@ -115,19 +124,34 @@ ownsearch setup
 ownsearch doctor
 ownsearch index ./docs --name docs
 ownsearch list-roots
+ownsearch literal-search "exact title or phrase" --limit 10
 ownsearch search "what is this repo about?" --limit 5
 ownsearch search-context "what is this repo about?" --limit 8 --max-chars 12000
+ownsearch deep-search-context "what is this repo about?" --final-limit 10 --max-chars 16000
 ownsearch serve-mcp
 ```
 
 On first run, `ownsearch setup` can:
 
 - prompt for `GEMINI_API_KEY`
-
+- open Google AI Studio automatically
 - save the key to `~/.ownsearch/.env`
+- validate the pasted key before saving it
+- ask whether setup output should be optimized for a human or an agent
 - print exact next commands for CLI and MCP usage
 - optionally print an MCP config snippet for a selected agent
 
+Gemini API usage is governed by Google’s current free-tier limits, quotas, and pricing.
+
+Useful setup modes:
+
+```bash
+ownsearch setup
+ownsearch setup --audience human
+ownsearch setup --audience agent
+ownsearch setup --json
+```
+
 ## Real-world fit
 
 `ownsearch` is a strong fit for:
````
````diff
@@ -186,7 +210,7 @@ ownsearch print-skill ownsearch-rag-search
 The skill is intended to help an agent:
 
 - rewrite weak user requests into stronger retrieval queries
-- decide when to use `search_context` vs `
+- decide when to use `literal_search` vs `search_context` vs `deep_search_context` vs `get_chunks`
 - recover from poor first-pass retrieval
 - avoid duplicate-heavy answer synthesis
 - stay grounded when retrieval is probabilistic
````
````diff
@@ -203,8 +227,12 @@ The skill is intended to help an agent:
   Lists approved indexed roots.
 - `ownsearch search "<query>"`
   Returns reranked search hits from the vector store.
+- `ownsearch literal-search "<query>"`
+  Runs exact text search with `ripgrep` over indexed roots.
 - `ownsearch search-context "<query>"`
   Returns a compact grounded context bundle for agents.
+- `ownsearch deep-search-context "<query>"`
+  Runs a deeper multi-query retrieval pass for ambiguous or archive-style questions.
 - `ownsearch delete-root <rootId>`
   Removes a root from config and deletes its vectors from Qdrant.
 - `ownsearch store-status`
````
````diff
@@ -220,9 +248,12 @@ The skill is intended to help an agent:
 
 The MCP server currently exposes:
 
+- `get_retrieval_skill`
 - `index_path`
 - `search`
+- `literal_search`
 - `search_context`
+- `deep_search_context`
 - `get_chunks`
 - `list_roots`
 - `delete_root`
````
````diff
@@ -230,9 +261,11 @@ The MCP server currently exposes:
 
 Recommended retrieval flow:
 
-1. Use `
-2. Use `
-3. Use `
+1. Use `literal_search` when the user gives an exact title, name, identifier, or quoted phrase.
+2. Use `search_context` for fast grounded retrieval.
+3. Use `deep_search_context` for ambiguous, archive-style, or multi-document questions.
+4. Use `search` when ranking and source inspection matter.
+5. Use `get_chunks` when exact wording or detailed comparison matters.
 
 ## Validation
 
````
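The recommended flow above is a routing decision an agent can make before calling any tool. A minimal sketch of such a router (the heuristics and the function name are illustrative assumptions, not part of the package):

```javascript
// Hypothetical pre-retrieval router following the recommended flow.
// ownsearch does not ship this function; the heuristics are illustrative.
function chooseRetrievalTool(query, opts = {}) {
  // Step 1: exact titles, names, identifiers, or quoted phrases.
  const hasQuotedPhrase = /"[^"]+"/.test(query);
  const looksLikeIdentifier = /\b[A-Z]{2,}-\d+\b|\b[0-9a-f]{8,}\b/.test(query);
  if (hasQuotedPhrase || looksLikeIdentifier) {
    return "literal_search";
  }
  // Step 3: archive-style, ambiguous, or multi-document questions.
  if (opts.multiDocument || opts.ambiguous) {
    return "deep_search_context";
  }
  // Step 2: the normal fast grounded path.
  return "search_context";
}
```

`search` and `get_chunks` (steps 4 and 5) would then be follow-up calls when ranking inspection or exact wording matters, rather than first choices.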
````diff
@@ -250,6 +283,25 @@ That smoke run currently validates:
 - `.pdf` retrieval
 - large plain text file bypass of the extracted-document byte cap
 
+The repo also includes comparative retrieval evals:
+
+- `scripts/eval-grep-vs-ownsearch.mts`
+- `scripts/eval-adversarial-retrieval.mts`
+
+These evals are meant to expose where:
+
+- plain `grep` is still best
+- shallow semantic retrieval is too weak
+- deeper retrieval improves agent-facing RAG quality
+
+On the current Mireglass benchmark corpus, the latest comparative run produced:
+
+- `deep`: `69.2` average score
+- `grep`: `65.67` average score
+- `shallow`: `65.09` average score
+
+The adversarial eval also showed that the current deep path reduced known noise-file leakage the most in this corpus.
+
 ## Limitations
 
 This package is deploy-ready for text-first corpora, but it is not universal document intelligence.
````
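The reported averages imply a modest but real margin for the deep path. As a quick check on the arithmetic, using only the scores quoted above:

```javascript
// Relative improvement of the deep path over the other two modes,
// computed from the average scores reported in this README.
const scores = { deep: 69.2, grep: 65.67, shallow: 65.09 };

function improvementPct(winner, baseline) {
  return ((winner - baseline) / baseline) * 100;
}

const overGrep = improvementPct(scores.deep, scores.grep);       // roughly 5.4%
const overShallow = improvementPct(scores.deep, scores.shallow); // roughly 6.3%
```

So "deeper retrieval improves agent-facing RAG quality" here means a roughly 5–6% relative gain on this corpus, not a dramatic one.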
````diff
@@ -269,6 +321,8 @@ Operational limitations:
 - extracted document quality depends on source document quality
 - duplicate-heavy corpora are improved by current reranking, but not fully solved for all edge cases
 - scanned or low-quality PDFs may require OCR before indexing
+- `literal_search` depends on `ripgrep` being available on the local machine
+- exact literal lookup can still beat semantic retrieval on some questions, so agents should use the hybrid flow instead of embeddings alone
 
 ## Future scope
 
````
package/dist/{chunk-ZQAY3FE3.js → chunk-TBXFY4OJ.js}
CHANGED

````diff
@@ -114,6 +114,12 @@ var IGNORED_DIRECTORIES = /* @__PURE__ */ new Set([
   "node_modules",
   "venv"
 ]);
+var DOWNWEIGHTED_PATH_SUBSTRINGS = [
+  "/09_benchmark_queries.txt",
+  "/10_extra_hard_notes_for_chunking.txt",
+  "09_benchmark_queries.txt",
+  "10_extra_hard_notes_for_chunking.txt"
+];
 
 // src/utils.ts
 import crypto from "crypto";
````
````diff
@@ -322,6 +328,26 @@ async function embedQuery(query) {
   const [vector] = await embed([query], "RETRIEVAL_QUERY");
   return vector;
 }
+async function validateGeminiApiKey(apiKey) {
+  const config = await loadConfig();
+  const validationClient = new GoogleGenAI({ apiKey });
+  try {
+    const response = await validationClient.models.embedContent({
+      model: config.embeddingModel,
+      contents: ["ownsearch key validation"],
+      config: {
+        taskType: "RETRIEVAL_QUERY",
+        outputDimensionality: config.vectorSize
+      }
+    });
+    if (!response.embeddings?.length || !response.embeddings[0]?.values?.length) {
+      throw new OwnSearchError("Gemini key validation returned no embeddings.");
+    }
+  } catch (error) {
+    const message = error instanceof Error ? error.message : String(error);
+    throw new OwnSearchError(`Gemini API key validation failed. ${message}`);
+  }
+}
 
 // src/qdrant.ts
 import { QdrantClient } from "@qdrant/js-client-rest";
````
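One subtlety in the new `validateGeminiApiKey`: the `OwnSearchError` thrown for an empty embeddings array is raised inside the same `try` block, so the `catch` re-wraps it into the "validation failed" message. A self-contained sketch of that control flow (with the Gemini call stubbed out as a plain async function, since the real call needs a live key):

```javascript
// Sketch of the wrapping behavior in validateGeminiApiKey: any failure,
// including the internal "no embeddings" check, surfaces as one wrapped error.
class OwnSearchError extends Error {}

async function validateWithClient(embedContent) {
  try {
    const response = await embedContent();
    if (!response.embeddings?.length || !response.embeddings[0]?.values?.length) {
      throw new OwnSearchError("Gemini key validation returned no embeddings.");
    }
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    throw new OwnSearchError(`Gemini API key validation failed. ${message}`);
  }
}

// Stub that mimics a response with no embeddings:
const emptyResponse = async () => ({ embeddings: [] });
```

The practical consequence is that CLI users always see one uniform "Gemini API key validation failed." prefix, whatever the underlying cause.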
````diff
@@ -431,6 +457,10 @@ function rerankAndDeduplicate(query, hits, limit) {
 }
 
 // src/qdrant.ts
+function isDownweightedPath(relativePath) {
+  const lowered = relativePath.toLowerCase();
+  return DOWNWEIGHTED_PATH_SUBSTRINGS.some((pattern) => lowered.includes(pattern.toLowerCase()));
+}
 var OwnSearchStore = class {
   constructor(client2, collectionName, vectorSize) {
     this.client = client2;
````
````diff
@@ -594,7 +624,7 @@ var OwnSearchStore = class {
     });
     const hits = results.map((result) => ({
       id: String(result.id),
-      score: result.score,
+      score: isDownweightedPath(String(result.payload?.relative_path ?? "")) ? result.score * 0.6 : result.score,
       rootId: String(result.payload?.root_id ?? ""),
       rootName: String(result.payload?.root_name ?? ""),
       filePath: String(result.payload?.file_path ?? ""),
````
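The new `isDownweightedPath` check and the `score * 0.6` multiplier together demote hits from the two benchmark noise files at ranking time. A small demonstration of the effect, with the helper reproduced from the diff and a hypothetical `adjustScore` wrapper standing in for the inline ternary (the sample paths are made up):

```javascript
// Reproduced from the diff: case-insensitive substring matching against
// the downweighted benchmark-noise paths, then a 0.6 score multiplier.
const DOWNWEIGHTED_PATH_SUBSTRINGS = [
  "/09_benchmark_queries.txt",
  "/10_extra_hard_notes_for_chunking.txt",
  "09_benchmark_queries.txt",
  "10_extra_hard_notes_for_chunking.txt"
];

function isDownweightedPath(relativePath) {
  const lowered = relativePath.toLowerCase();
  return DOWNWEIGHTED_PATH_SUBSTRINGS.some((pattern) => lowered.includes(pattern.toLowerCase()));
}

// Illustrative wrapper for the inline ternary in the search result mapping.
function adjustScore(score, relativePath) {
  return isDownweightedPath(relativePath) ? score * 0.6 : score;
}
```

Because the comparison is by lowercased substring, the files are caught at any nesting depth and regardless of path casing; all other files keep their raw Qdrant score.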
````diff
@@ -891,6 +921,203 @@ async function indexPath(rootPath, options = {}) {
   };
 }
 
+// src/literal-search.ts
+import { execFile as execFileCallback } from "child_process";
+import path5 from "path";
+import { promisify } from "util";
+var execFile = promisify(execFileCallback);
+function normalizePath(value) {
+  return value.replace(/\\/g, "/");
+}
+async function literalSearch(args) {
+  const config = await loadConfig();
+  let roots;
+  if (args.rootIds?.length) {
+    const resolved = await Promise.all(args.rootIds.map((rootId) => findRoot(rootId)));
+    const missingRootIds = args.rootIds.filter((_, index) => !resolved[index]);
+    if (missingRootIds.length) {
+      throw new OwnSearchError(
+        `Unknown root ID(s) for literal search: ${missingRootIds.join(", ")}. Call \`list-roots\` to see valid root IDs.`
+      );
+    }
+    roots = resolved.filter((root) => Boolean(root));
+  } else {
+    roots = config.roots;
+  }
+  if (!roots.length) {
+    throw new OwnSearchError("No indexed roots are available for literal search. Call `list_roots` or `index_path` first.");
+  }
+  const limit = Math.max(1, Math.min(args.limit ?? 20, 100));
+  const matches = [];
+  for (const root of roots) {
+    const { stdout } = await execFile(
+      "rg",
+      [
+        "-n",
+        "-i",
+        "--fixed-strings",
+        "--max-count",
+        String(limit),
+        args.query,
+        root.path
+      ],
+      {
+        windowsHide: true,
+        maxBuffer: 1024 * 1024 * 10
+      }
+    ).catch((error) => {
+      if (error?.code === 1) {
+        return { stdout: "" };
+      }
+      throw new OwnSearchError("Literal search failed. Ensure `rg` (ripgrep) is installed and available on PATH.");
+    });
+    for (const line of stdout.split(/\r?\n/)) {
+      if (!line.trim()) {
+        continue;
+      }
+      const match = line.match(/^(.*?):(\d+):(.*)$/);
+      if (!match) {
+        continue;
+      }
+      const filePath = match[1];
+      const relativePath = normalizePath(path5.relative(root.path, filePath));
+      if (args.pathSubstring && !relativePath.toLowerCase().includes(args.pathSubstring.toLowerCase())) {
+        continue;
+      }
+      matches.push({
+        rootId: root.id,
+        rootName: root.name,
+        filePath,
+        relativePath,
+        lineNumber: Number(match[2]),
+        content: match[3].trim()
+      });
+      if (matches.length >= limit) {
+        return matches;
+      }
+    }
+  }
+  return matches;
+}
+
````
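`literalSearch` shells out to `rg -n -i --fixed-strings` and parses each `file:line:text` output line with the regex shown in the diff (ripgrep's exit code 1 means "no matches" and is treated as an empty result). The parsing step can be checked in isolation; the sample lines below are invented:

```javascript
// The same line-parsing regex literalSearch applies to ripgrep output:
// "<path>:<line number>:<matched text>". The lazy first group stops at the
// first ":<digits>:" separator it can satisfy.
function parseRgLine(line) {
  const match = line.match(/^(.*?):(\d+):(.*)$/);
  if (!match) {
    return null;
  }
  return {
    filePath: match[1],
    lineNumber: Number(match[2]),
    content: match[3].trim()
  };
}
```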
1003
|
+
// src/retrieval.ts
|
|
1004
|
+
var LEADING_PATTERNS = [
|
|
1005
|
+
/^(what is|what was|who is|who was)\s+/i,
|
|
1006
|
+
/^(tell me about|explain|summarize|describe)\s+/i,
|
|
1007
|
+
/^(where is|where was|where does|where did)\s+/i,
|
|
1008
|
+
/^(how does|how do|how did|why does|why did)\s+/i
|
|
1009
|
+
];
|
|
1010
|
+
var STOPWORDS = /* @__PURE__ */ new Set([
|
|
1011
|
+
"a",
|
|
1012
|
+
"an",
|
|
1013
|
+
"and",
|
|
1014
|
+
"are",
|
|
1015
|
+
"as",
|
|
1016
|
+
"at",
|
|
1017
|
+
"be",
|
|
1018
|
+
"by",
|
|
1019
|
+
"for",
|
|
1020
|
+
"from",
|
|
1021
|
+
"how",
|
|
1022
|
+
"in",
|
|
1023
|
+
"is",
|
|
1024
|
+
"it",
|
|
1025
|
+
"of",
|
|
1026
|
+
"on",
|
|
1027
|
+
"or",
|
|
1028
|
+
"that",
|
|
1029
|
+
"the",
|
|
1030
|
+
"this",
|
|
1031
|
+
"to",
|
|
1032
|
+
"was",
|
|
1033
|
+
"what",
|
|
1034
|
+
"when",
|
|
1035
|
+
"where",
|
|
1036
|
+
"which",
|
|
1037
|
+
"who",
|
|
1038
|
+
"why"
|
|
1039
|
+
]);
|
|
1040
|
+
function deriveQueryVariants(query) {
|
|
1041
|
+
const normalized = query.trim().replace(/\s+/g, " ");
|
|
1042
|
+
const variants = /* @__PURE__ */ new Set();
|
|
1043
|
+
if (!normalized) {
|
|
1044
|
+
return [];
|
|
1045
|
+
}
|
|
1046
|
+
variants.add(normalized);
|
|
1047
|
+
let stripped = normalized;
|
|
1048
|
+
for (const pattern of LEADING_PATTERNS) {
|
|
1049
|
+
stripped = stripped.replace(pattern, "");
|
|
1050
|
+
}
|
|
1051
|
+
stripped = stripped.replace(/[?.!]+$/g, "").trim();
|
|
1052
|
+
if (stripped && stripped !== normalized) {
|
|
1053
|
+
variants.add(stripped);
|
|
1054
|
+
}
|
|
1055
|
+
const quotedMatches = [...normalized.matchAll(/"([^"]+)"/g)].map((match) => match[1]?.trim()).filter(Boolean);
|
|
1056
|
+
for (const match of quotedMatches) {
|
|
1057
|
+
variants.add(match);
|
|
1058
|
+
}
|
|
1059
|
+
const keywordVariant = normalized.split(/[^A-Za-z0-9_-]+/).filter((token) => token && !STOPWORDS.has(token.toLowerCase())).slice(0, 8).join(" ").trim();
|
|
1060
|
+
if (keywordVariant && keywordVariant !== normalized && keywordVariant !== stripped) {
|
|
1061
|
+
variants.add(keywordVariant);
|
|
1062
|
+
}
|
|
1063
|
+
return [...variants].slice(0, 4);
|
|
1064
|
+
}
|
|
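`deriveQueryVariants` fans one question out into at most four reformulations: the normalized original, a version with leading question scaffolding stripped, any quoted phrases, and a stopword-free keyword form. Its behavior can be sketched with a trimmed-down port (the pattern and stopword lists here are smaller than the shipped ones):

```javascript
// Trimmed-down port of deriveQueryVariants: original, stripped, quoted,
// and keyword variants, capped at four. Lists are abbreviated for brevity.
const LEADING_PATTERNS = [/^(what is|what was|who is|who was)\s+/i];
const STOPWORDS = new Set(["what", "is", "the", "a", "an", "of", "this"]);

function deriveQueryVariants(query) {
  const normalized = query.trim().replace(/\s+/g, " ");
  if (!normalized) return [];
  const variants = new Set([normalized]);
  // Variant 2: strip leading question scaffolding and trailing punctuation.
  let stripped = normalized;
  for (const pattern of LEADING_PATTERNS) stripped = stripped.replace(pattern, "");
  stripped = stripped.replace(/[?.!]+$/g, "").trim();
  if (stripped && stripped !== normalized) variants.add(stripped);
  // Variant 3: any quoted phrases become standalone queries.
  for (const m of normalized.matchAll(/"([^"]+)"/g)) {
    if (m[1]?.trim()) variants.add(m[1].trim());
  }
  // Variant 4: up to eight non-stopword tokens joined as a keyword query.
  const keywordVariant = normalized
    .split(/[^A-Za-z0-9_-]+/)
    .filter((t) => t && !STOPWORDS.has(t.toLowerCase()))
    .slice(0, 8)
    .join(" ")
    .trim();
  if (keywordVariant && keywordVariant !== normalized && keywordVariant !== stripped) {
    variants.add(keywordVariant);
  }
  return [...variants].slice(0, 4);
}
```

Each variant is then embedded and searched separately by `deepSearchContext`, which is why the cap of four matters: it bounds the number of embedding calls per deep query.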
````diff
+function diversifyHits(hits, limit) {
+  const seenIds = /* @__PURE__ */ new Set();
+  const fileCounts = /* @__PURE__ */ new Map();
+  const diversified = [];
+  const sorted = [...hits].sort((a, b) => {
+    const aCount = fileCounts.get(a.relativePath) ?? 0;
+    const bCount = fileCounts.get(b.relativePath) ?? 0;
+    const aScore = a.score - aCount * 0.015;
+    const bScore = b.score - bCount * 0.015;
+    return bScore - aScore;
+  });
+  for (const hit of sorted) {
+    if (seenIds.has(hit.id)) {
+      continue;
+    }
+    const count = fileCounts.get(hit.relativePath) ?? 0;
+    if (count >= 3 && diversified.length >= Math.max(3, Math.floor(limit / 2))) {
+      continue;
+    }
+    diversified.push(hit);
+    seenIds.add(hit.id);
+    fileCounts.set(hit.relativePath, count + 1);
+    if (diversified.length >= limit) {
+      break;
+    }
+  }
+  return diversified;
+}
````
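`diversifyHits` merges the per-variant result lists, dropping duplicate chunk IDs and capping how many hits one file contributes. Note that `fileCounts` is still empty while the sort comparator runs (it is only populated in the loop afterwards), so the sort appears to reduce to plain score order and the per-file cap in the loop does the actual diversification. A condensed sketch with that comparator simplified accordingly, exercised on synthetic hits:

```javascript
// Condensed sketch of diversifyHits: duplicate chunk IDs are dropped, and a
// file stops contributing after 3 hits once half the limit is filled.
// The shipped comparator subtracts a per-file penalty, but since fileCounts
// is empty during sorting it behaves like this plain score sort.
function diversifyHits(hits, limit) {
  const seenIds = new Set();
  const fileCounts = new Map();
  const diversified = [];
  const sorted = [...hits].sort((a, b) => b.score - a.score);
  for (const hit of sorted) {
    if (seenIds.has(hit.id)) continue;
    const count = fileCounts.get(hit.relativePath) ?? 0;
    if (count >= 3 && diversified.length >= Math.max(3, Math.floor(limit / 2))) continue;
    diversified.push(hit);
    seenIds.add(hit.id);
    fileCounts.set(hit.relativePath, count + 1);
    if (diversified.length >= limit) break;
  }
  return diversified;
}
```

The ID dedup matters because the same chunk routinely comes back for several query variants; without it, deep search would return near-duplicate context.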
1093
|
+
async function deepSearchContext(query, options = {}) {
|
|
1094
|
+
const store = await createStore();
|
|
1095
|
+
const variants = deriveQueryVariants(query);
|
|
1096
|
+
const allHits = [];
|
|
1097
|
+
for (const variant of variants) {
|
|
1098
|
+
const vector = await embedQuery(variant);
|
|
1099
|
+
const hits = await store.search(
|
|
1100
|
+
vector,
|
|
1101
|
+
{
|
|
1102
|
+
queryText: variant,
|
|
1103
|
+
rootIds: options.rootIds,
|
|
1104
|
+
pathSubstring: options.pathSubstring
|
|
1105
|
+
},
|
|
1106
|
+
Math.max(1, Math.min(options.perQueryLimit ?? 6, 12))
|
|
1107
|
+
);
|
|
1108
|
+
allHits.push(...hits);
|
|
1109
|
+
}
|
|
1110
|
+
const finalHits = diversifyHits(allHits, Math.max(1, Math.min(options.finalLimit ?? 10, 20)));
|
|
1111
|
+
const bundle = buildContextBundle(query, finalHits, Math.max(500, options.maxChars ?? 16e3));
|
|
1112
|
+
return {
|
|
1113
|
+
query,
|
|
1114
|
+
queryVariants: variants,
|
|
1115
|
+
hitCount: finalHits.length,
|
|
1116
|
+
distinctFiles: [...new Set(finalHits.map((hit) => hit.relativePath))].length,
|
|
1117
|
+
bundle
|
|
1118
|
+
};
|
|
1119
|
+
}
|
|
1120
|
+
|
|
894
1121
|
export {
|
|
895
1122
|
buildContextBundle,
|
|
896
1123
|
getConfigPath,
|
|
````diff
@@ -905,6 +1132,9 @@ export {
   listRoots,
   OwnSearchError,
   embedQuery,
+  validateGeminiApiKey,
   createStore,
-  indexPath
+  indexPath,
+  literalSearch,
+  deepSearchContext
 };
````
package/dist/cli.js
CHANGED
````diff
@@ -3,6 +3,7 @@ import {
   OwnSearchError,
   buildContextBundle,
   createStore,
+  deepSearchContext,
   deleteRootDefinition,
   embedQuery,
   findRoot,
````
````diff
@@ -11,11 +12,13 @@ import {
   getEnvPath,
   indexPath,
   listRoots,
+  literalSearch,
   loadConfig,
   loadOwnSearchEnv,
   readEnvFile,
-  saveGeminiApiKey
-} from "./chunk-ZQAY3FE3.js";
+  saveGeminiApiKey,
+  validateGeminiApiKey
+} from "./chunk-TBXFY4OJ.js";
 
 // src/cli.ts
 import fs from "fs/promises";
````
````diff
@@ -29,12 +32,16 @@ import { Command } from "commander";
 import { execFile } from "child_process";
 import { promisify } from "util";
 var execFileAsync = promisify(execFile);
+var DOCKER_DESKTOP_WINDOWS_URL = "https://docs.docker.com/desktop/setup/install/windows-install/";
+var DOCKER_DESKTOP_OVERVIEW_URL = "https://docs.docker.com/desktop/";
 async function runDocker(args) {
   try {
     const { stdout } = await execFileAsync("docker", args, { windowsHide: true });
     return stdout.trim();
   } catch (error) {
-    throw new OwnSearchError(
+    throw new OwnSearchError(
+      `Docker is required for Qdrant setup. Install Docker Desktop and ensure \`docker\` is on PATH. Windows install guide: ${DOCKER_DESKTOP_WINDOWS_URL} General Docker Desktop docs: ${DOCKER_DESKTOP_OVERVIEW_URL}`
+    );
   }
 }
 async function ensureQdrantDocker() {
````
````diff
@@ -71,6 +78,8 @@ loadOwnSearchEnv();
 var program = new Command();
 var PACKAGE_NAME = "ownsearch";
 var GEMINI_API_KEY_URL = "https://aistudio.google.com/apikey";
+var DOCKER_DESKTOP_WINDOWS_URL2 = "https://docs.docker.com/desktop/setup/install/windows-install/";
+var DOCKER_DESKTOP_OVERVIEW_URL2 = "https://docs.docker.com/desktop/";
 var BUNDLED_SKILL_NAME = "ownsearch-rag-search";
 var SUPPORTED_AGENTS = [
   "codex",
````
````diff
@@ -90,10 +99,7 @@ function requireGeminiKey() {
 function buildAgentConfig(agent) {
   const stdioConfig = {
     command: "npx",
-    args: ["-y", PACKAGE_NAME, "serve-mcp"]
-    env: {
-      GEMINI_API_KEY: "${GEMINI_API_KEY}"
-    }
+    args: ["-y", PACKAGE_NAME, "serve-mcp"]
   };
   switch (agent) {
     case "codex":
````
````diff
@@ -126,7 +132,6 @@ function buildAgentConfig(agent) {
         type: "local",
         command: stdioConfig.command,
         args: stdioConfig.args,
-        env: stdioConfig.env,
         tools: ["*"]
       }
     }
````
````diff
@@ -203,22 +208,59 @@ async function promptForGeminiKey() {
     output: process.stdout
   });
   try {
-    console.log(`
-    console.log(
+    console.log(`OwnSearch needs a Gemini API key for indexing and search.`);
+    console.log("Gemini API usage is governed by Google\u2019s current free-tier limits, quotas, and pricing.");
+    console.log(`Open Google AI Studio here: ${GEMINI_API_KEY_URL}`);
+    console.log(`OwnSearch will save the key to ${getEnvPath()}`);
+    openGeminiKeyPage();
+    await rl.question("Press Enter after the AI Studio page is open and you are ready to paste the key: ");
     for (; ; ) {
       const apiKey = (await rl.question("Paste GEMINI_API_KEY and press Enter (Ctrl+C to cancel): ")).trim();
       if (!apiKey) {
         console.log("GEMINI_API_KEY is required for indexing and search.");
         continue;
       }
-      await saveGeminiApiKey(apiKey);
       process.env.GEMINI_API_KEY = apiKey;
+      process.env.GOOGLE_API_KEY = apiKey;
+      process.stdout.write("Validating key with Gemini...");
+      try {
+        await validateGeminiApiKey(apiKey);
+        process.stdout.write(" ok\n");
+      } catch (error) {
+        process.stdout.write(" failed\n");
+        console.log(error instanceof Error ? error.message : String(error));
+        continue;
+      }
+      await saveGeminiApiKey(apiKey);
       return true;
     }
   } finally {
     rl.close();
   }
 }
+function openGeminiKeyPage() {
+  try {
+    if (process.platform === "win32") {
+      spawn("cmd", ["/c", "start", "", GEMINI_API_KEY_URL], {
+        stdio: "ignore",
+        detached: true
+      }).unref();
+      return;
+    }
+    if (process.platform === "darwin") {
+      spawn("open", [GEMINI_API_KEY_URL], {
+        stdio: "ignore",
+        detached: true
+      }).unref();
+      return;
+    }
+    spawn("xdg-open", [GEMINI_API_KEY_URL], {
+      stdio: "ignore",
+      detached: true
+    }).unref();
+  } catch {
+  }
+}
 function getGeminiApiKeySource() {
   if (readEnvFile(getEnvPath()).GEMINI_API_KEY) {
     return "ownsearch-env";
````
````diff
@@ -247,28 +289,81 @@ async function ensureManagedGeminiKey() {
     savedToManagedEnv: prompted
   };
 }
+async function promptForSetupAudience() {
+  if (!process.stdin.isTTY || !process.stdout.isTTY) {
+    return "agent";
+  }
+  const rl = readline.createInterface({
+    input: process.stdin,
+    output: process.stdout
+  });
+  try {
+    console.log("");
+    console.log("Who is running setup?");
+    console.log(" 1. Human");
+    console.log(" 2. Agent");
+    for (; ; ) {
+      const answer = (await rl.question("Select 1-2: ")).trim().toLowerCase();
+      switch (answer) {
+        case "1":
+        case "human":
+          return "human";
+        case "2":
+        case "agent":
+          return "agent";
+        default:
+          console.log("Enter 1 or 2.");
+      }
+    }
+  } finally {
+    rl.close();
+  }
+}
 function printSetupNextSteps() {
   console.log("");
-  console.log("Next
-  console.log("
+  console.log("Next steps");
+  console.log(" 1. Index a folder:");
+  console.log(" ownsearch index C:\\path\\to\\folder --name my-folder");
+  console.log(" 2. Test exact-match search in the CLI:");
+  console.log(' ownsearch literal-search "exact title or phrase" --limit 10');
+  console.log(" 3. Test semantic search in the CLI:");
+  console.log(' ownsearch search "your question here" --limit 5');
+  console.log(" 4. Get grounded context for an agent:");
+  console.log(' ownsearch search-context "your question here" --limit 8 --max-chars 12000');
+  console.log(" 5. Use deeper retrieval for archive-style questions:");
+  console.log(' ownsearch deep-search-context "your question here" --final-limit 10 --max-chars 16000');
+  console.log(" 6. Start the MCP server:");
+  console.log(" ownsearch serve-mcp");
+  console.log(" 7. Print agent-specific config:");
+  console.log(" ownsearch print-agent-config codex");
+  console.log(" 8. Print the bundled retrieval skill:");
+  console.log(` ownsearch print-skill ${BUNDLED_SKILL_NAME}`);
+  console.log("");
+  console.log("Docker requirement");
+  console.log(" OwnSearch requires Docker Desktop so it can run Qdrant locally.");
+  console.log(` Windows install: ${DOCKER_DESKTOP_WINDOWS_URL2}`);
+  console.log(` Docker docs: ${DOCKER_DESKTOP_OVERVIEW_URL2}`);
+}
+function printAgentSetupNextSteps() {
+  console.log("");
+  console.log("Agent-ready commands");
+  console.log(" Index an approved folder:");
   console.log(" ownsearch index C:\\path\\to\\folder --name my-folder");
-  console.log("
-  console.log(' ownsearch search "
-  console.log("
+  console.log(" For exact names, titles, IDs, or quoted strings:");
+  console.log(' ownsearch literal-search "exact text here" --limit 10');
+  console.log(" Retrieve grounded context:");
   console.log(' ownsearch search-context "your question here" --limit 8 --max-chars 12000');
-  console.log("
+  console.log(" Use deeper retrieval for ambiguous or multi-document questions:");
+  console.log(' ownsearch deep-search-context "your question here" --final-limit 10 --max-chars 16000');
+  console.log(" Start the MCP server:");
   console.log(" ownsearch serve-mcp");
-  console.log("
+  console.log(" Print MCP config for the host agent:");
   console.log(" ownsearch print-agent-config codex");
-  console.log("
-  console.log("
-  console.log("
-  console.log(
-  console.log(
-  console.log(" ownsearch print-agent-config windsurf");
-  console.log(" ownsearch print-agent-config continue");
-  console.log(" Bundled retrieval skill:");
-  console.log(` ownsearch print-skill ${BUNDLED_SKILL_NAME}`);
+  console.log("");
+  console.log("Docker requirement");
+  console.log(" OwnSearch requires Docker Desktop so it can run Qdrant locally.");
+  console.log(` Windows install: ${DOCKER_DESKTOP_WINDOWS_URL2}`);
+  console.log(` Docker docs: ${DOCKER_DESKTOP_OVERVIEW_URL2}`);
 }
 async function promptForAgentChoice() {
   if (!process.stdin.isTTY || !process.stdout.isTTY) {
````
@@ -329,16 +424,65 @@ async function promptForAgentChoice() {
|
|
|
329
424
|
}
|
|
330
425
|
}
|
|
331
426
|
function printAgentConfigSnippet(agent) {
|
|
427
|
+
const payload = buildAgentConfig(agent);
|
|
332
428
|
console.log("");
|
|
333
|
-
console.log(`
|
|
334
|
-
|
|
429
|
+
console.log(`Connect OwnSearch to ${agent}`);
|
|
430
|
+
if (payload.installMethod) {
|
|
431
|
+
console.log(` Recommended install method: ${payload.installMethod}`);
|
|
432
|
+
}
|
|
433
|
+
if (payload.configPath) {
|
|
434
|
+
console.log(` Config file: ${payload.configPath}`);
|
|
435
|
+
}
|
|
436
|
+
if (payload.configScope) {
|
|
437
|
+
console.log(` Scope: ${payload.configScope}`);
|
|
438
|
+
}
|
|
439
|
+
if (payload.note) {
|
|
440
|
+
console.log(` Note: ${payload.note}`);
|
|
441
|
+
}
|
|
442
|
+
if (payload.nextStep) {
|
|
443
|
+
console.log(` Next step: ${payload.nextStep}`);
|
|
444
|
+
}
|
|
445
|
+
if (payload.config) {
|
|
446
|
+
console.log("");
|
|
447
|
+
console.log("Paste this config:");
|
|
448
|
+
console.log(JSON.stringify(payload.config, null, 2));
|
|
449
|
+
console.log("");
|
|
450
|
+
console.log(`OwnSearch will load GEMINI_API_KEY from ${getEnvPath()} if you ran \`ownsearch setup\`.`);
|
|
451
|
+
}
|
|
452
|
+
}
|
|
453
|
+
+function printSetupSummary(input) {
+  console.log("OwnSearch setup complete");
+  console.log(" Docker is required because OwnSearch runs Qdrant locally in Docker.");
+  console.log(` Docker docs: ${DOCKER_DESKTOP_WINDOWS_URL2}`);
+  console.log(` Config: ${input.configPath}`);
+  console.log(` API key file: ${input.envPath}`);
+  console.log(` Qdrant: ${input.qdrantUrl} (${input.qdrantStarted ? "started now" : "already running or reachable"})`);
+  if (input.geminiApiKeyPresent) {
+    console.log(` Gemini API key: ready (${input.geminiApiKeySource})`);
+    if (input.geminiApiKeySavedToManagedEnv) {
+      console.log(" Saved your key to the managed OwnSearch env file.");
+    }
+  } else {
+    console.log(" Gemini API key: missing");
+  }
 }
-
-
+function printAgentSetupSummary(input) {
+  console.log("OwnSearch setup ready for agent use");
+  console.log(" Docker is required because OwnSearch runs Qdrant locally in Docker.");
+  console.log(` Docker docs: ${DOCKER_DESKTOP_WINDOWS_URL2}`);
+  console.log(` Config path: ${input.configPath}`);
+  console.log(` Managed env path: ${input.envPath}`);
+  console.log(` Qdrant endpoint: ${input.qdrantUrl}`);
+  console.log(` Qdrant status: ${input.qdrantStarted ? "started during setup" : "already reachable"}`);
+  console.log(` Gemini key: ${input.geminiApiKeyPresent ? `ready (${input.geminiApiKeySource})` : "missing"}`);
+}
+program.name("ownsearch").description("Gemini-powered local search MCP server backed by Qdrant.").version("0.1.4");
+program.command("setup").description("Create config and start a local Qdrant Docker container.").option("--json", "Print machine-readable JSON output").option("--audience <audience>", "Choose output style: human or agent").action(async (options) => {
   const config = await loadConfig();
   const result = await ensureQdrantDocker();
   const gemini = await ensureManagedGeminiKey();
-
+  const audience = options.json ? "agent" : options.audience === "human" || options.audience === "agent" ? options.audience : await promptForSetupAudience();
+  const summary = {
     configPath: getConfigPath(),
     envPath: getEnvPath(),
     qdrantUrl: config.qdrantUrl,
@@ -346,15 +490,27 @@ program.command("setup").description("Create config and start a local Qdrant Doc
     geminiApiKeyPresent: gemini.present,
     geminiApiKeySource: gemini.source,
     geminiApiKeySavedToManagedEnv: gemini.savedToManagedEnv
-  }
+  };
+  if (options.json) {
+    console.log(JSON.stringify(summary, null, 2));
+    return;
+  } else if (audience === "agent") {
+    printAgentSetupSummary(summary);
+  } else {
+    printSetupSummary(summary);
+  }
   if (!gemini.present) {
     console.log(`GEMINI_API_KEY is not set. Re-run setup or add it to ${getEnvPath()} before indexing or search.`);
     return;
   }
-
-
-
-
+  if (audience === "agent") {
+    printAgentSetupNextSteps();
+  } else {
+    printSetupNextSteps();
+    const agent = await promptForAgentChoice();
+    if (agent) {
+      printAgentConfigSnippet(agent);
+    }
   }
 });
 program.command("index").argument("<folder>", "Folder path to index").option("-n, --name <name>", "Display name for the indexed root").option("--max-file-bytes <n>", "Override the file size limit for this run", (value) => Number(value)).description("Index a local folder into Qdrant using Gemini embeddings.").action(async (folder, options) => {
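The `--json` and `--audience` precedence in the setup action above can be summarized as a pure function. This is an illustrative sketch only; `resolveAudience` and the stubbed prompt are not names from the package — `promptForSetupAudience` is the real interactive fallback shown in the diff.

```javascript
// Sketch of the setup command's output-mode precedence:
// --json always wins and implies agent-style output; an explicit
// --audience of "human" or "agent" is honored; anything else falls
// back to an interactive prompt (stubbed here for illustration).
async function resolveAudience(options, prompt) {
  if (options.json) return "agent";
  if (options.audience === "human" || options.audience === "agent") {
    return options.audience;
  }
  return prompt();
}
```

So `ownsearch setup --json --audience human` still emits the machine-readable summary, because the JSON flag is checked before the audience option.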
@@ -382,6 +538,17 @@ program.command("search").argument("<query>", "Natural language query").option("
     console.log(JSON.stringify({ query, hits }, null, 2));
   }
 );
+program.command("literal-search").argument("<query>", "Exact text query").option("--root-id <rootId...>", "Restrict search to one or more root IDs (repeatable)").option("--limit <n>", "Max matches (default 20)", (value) => Number(value), 20).option("--path <substr>", "Filter results to files whose relative path contains this substring").description("Run grep-style exact text search over indexed roots with ripgrep.").action(
+  async (query, options) => {
+    const matches = await literalSearch({
+      query,
+      rootIds: options.rootId,
+      limit: Math.max(1, Math.min(options.limit ?? 20, 100)),
+      pathSubstring: options.path
+    });
+    console.log(JSON.stringify({ query, matches }, null, 2));
+  }
+);
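The repeated `Math.max(1, Math.min(value ?? fallback, cap))` expression in these commands clamps a user-supplied numeric option into a safe range. A minimal sketch of the pattern (the helper name is ours, not the package's):

```javascript
// Clamp a possibly-missing, possibly-out-of-range numeric option into
// [1, cap], defaulting to `fallback` when the option was not supplied.
// Mirrors Math.max(1, Math.min(value ?? fallback, cap)) as used by
// literal-search (cap 100) and the context commands.
function clampLimit(value, fallback, cap) {
  return Math.max(1, Math.min(value ?? fallback, cap));
}
```

For example, a missing `--limit` falls back to the default, `--limit 0` is raised to 1, and `--limit 500` is capped at 100 for `literal-search`.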
 program.command("search-context").argument("<query>", "Natural language query").option("--root-id <rootId...>", "Restrict search to one or more root IDs (repeatable)").option("--limit <n>", "Max search hits to consider (default 8)", (value) => Number(value), 8).option("--max-chars <n>", "Max context characters to return (default 12000)", (value) => Number(value), 12e3).option("--path <substr>", "Filter results to files whose relative path contains this substring").description("Search the local Qdrant store and return a bundled context payload for agent use.").action(
   async (query, options) => {
     requireGeminiKey();
@@ -399,6 +566,19 @@ program.command("search-context").argument("<query>", "Natural language query").
     console.log(JSON.stringify(buildContextBundle(query, hits, Math.max(500, options.maxChars ?? 12e3)), null, 2));
   }
 );
+program.command("deep-search-context").argument("<query>", "Natural language query").option("--root-id <rootId...>", "Restrict search to one or more root IDs (repeatable)").option("--per-query-limit <n>", "Max hits per query variant (default 6)", (value) => Number(value), 6).option("--final-limit <n>", "Max aggregated result blocks (default 10)", (value) => Number(value), 10).option("--max-chars <n>", "Max context characters to return (default 16000)", (value) => Number(value), 16e3).option("--path <substr>", "Filter results to files whose relative path contains this substring").description("Run a deeper multi-query retrieval pass for ambiguous or archive-style questions.").action(
+  async (query, options) => {
+    requireGeminiKey();
+    const result = await deepSearchContext(query, {
+      rootIds: options.rootId,
+      pathSubstring: options.path,
+      perQueryLimit: Math.max(1, Math.min(options.perQueryLimit ?? 6, 12)),
+      finalLimit: Math.max(1, Math.min(options.finalLimit ?? 10, 20)),
+      maxChars: Math.max(500, options.maxChars ?? 16e3)
+    });
+    console.log(JSON.stringify(result, null, 2));
+  }
+);
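`deepSearchContext` itself lives in the renamed chunk module, so its internals are not visible in this diff. Based only on the option names (`perQueryLimit`, `finalLimit`) and the tool description, its aggregation step presumably resembles the following sketch; the function name, hit shape, and best-score dedupe policy are all assumptions, not the actual implementation:

```javascript
// Illustrative sketch of multi-query aggregation: take per-variant hit
// lists, deduplicate by chunk id keeping each chunk's best score, and
// return the top `finalLimit` chunks. The real deepSearchContext also
// generates the query variants and builds the context bundle.
function aggregateHits(hitsPerVariant, finalLimit) {
  const best = new Map();
  for (const hits of hitsPerVariant) {
    for (const hit of hits) {
      const prev = best.get(hit.id);
      if (!prev || hit.score > prev.score) best.set(hit.id, hit);
    }
  }
  return [...best.values()]
    .sort((a, b) => b.score - a.score)
    .slice(0, finalLimit);
}
```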
 program.command("list-roots").description("List indexed roots registered in local config.").action(async () => {
   console.log(JSON.stringify({ roots: await listRoots() }, null, 2));
 });
@@ -461,9 +641,14 @@ program.command("serve-mcp").description("Start the stdio MCP server.").action(a
     process.exitCode = code ?? 0;
   });
 });
-program.command("print-agent-config").argument("<agent>", SUPPORTED_AGENTS.join(" | ")).description("Print an MCP config snippet for a supported agent.").action(async (agent) => {
+program.command("print-agent-config").argument("<agent>", SUPPORTED_AGENTS.join(" | ")).description("Print an MCP config snippet for a supported agent.").option("--json", "Print the full machine-readable payload").action(async (agent, options) => {
   if (SUPPORTED_AGENTS.includes(agent)) {
-
+    const payload = buildAgentConfig(agent);
+    if (options.json) {
+      console.log(JSON.stringify(payload, null, 2));
+      return;
+    }
+    printAgentConfigSnippet(agent);
     return;
   }
   throw new OwnSearchError(`Unsupported agent: ${agent}`);
package/dist/mcp/server.js
CHANGED
@@ -3,19 +3,26 @@ import {
   OwnSearchError,
   buildContextBundle,
   createStore,
+  deepSearchContext,
   deleteRootDefinition,
   embedQuery,
   findRoot,
   indexPath,
+  literalSearch,
   loadConfig,
   loadOwnSearchEnv
-} from "../chunk-ZQAY3FE3.js";
+} from "../chunk-TBXFY4OJ.js";

 // src/mcp/server.ts
+import fs from "fs/promises";
+import path from "path";
 import { Server } from "@modelcontextprotocol/sdk/server/index.js";
 import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
 import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";
+import { fileURLToPath } from "url";
 loadOwnSearchEnv();
+var BUNDLED_SKILL_NAME = "ownsearch-rag-search";
+var SERVER_VERSION = "0.1.5";
 function asText(result) {
   return {
     content: [
@@ -26,14 +33,79 @@ function asText(result) {
     ]
   };
 }
+function withGuidance(summary, data, nextActions = []) {
+  return asText({
+    summary,
+    nextActions,
+    data
+  });
+}
+async function readBundledSkill(skillName) {
+  const currentFilePath = fileURLToPath(import.meta.url);
+  const packageRoot = path.resolve(path.dirname(currentFilePath), "..", "..");
+  const skillPath = path.join(packageRoot, "skills", skillName, "SKILL.md");
+  return fs.readFile(skillPath, "utf8");
+}
+function diagnoseError(message) {
+  const lower = message.toLowerCase();
+  if (lower.includes("gemini_api_key")) {
+    return {
+      summary: "Gemini API key is missing.",
+      nextActions: [
+        "Run `ownsearch setup` in a normal terminal and complete Gemini key setup.",
+        "If this MCP server is running in a restricted environment, ensure it can read ~/.ownsearch/.env or receive GEMINI_API_KEY in its process environment."
+      ]
+    };
+  }
+  if (lower.includes("fetch failed") || lower.includes("network") || lower.includes("timeout")) {
+    return {
+      summary: "OwnSearch could not reach Gemini or Qdrant from this execution environment.",
+      nextActions: [
+        "Check whether the MCP server is running in a sandboxed or restricted environment.",
+        "Verify Gemini API access works in a normal terminal with `ownsearch doctor`.",
+        "Verify local Qdrant is reachable at the configured URL."
+      ]
+    };
+  }
+  if (lower.includes("unknown root")) {
+    return {
+      summary: "The requested root ID does not exist.",
+      nextActions: [
+        "Call `list_roots` to get valid root IDs.",
+        "If the folder was not indexed yet, call `index_path` first."
+      ]
+    };
+  }
+  if (lower.includes("qdrant")) {
+    return {
+      summary: "Qdrant is not reachable or is misconfigured.",
+      nextActions: [
+        "Run `ownsearch setup` or `ownsearch doctor` in a normal terminal.",
+        "Confirm Docker is running and Qdrant is reachable at the configured URL."
+      ]
+    };
+  }
+  return {
+    summary: "OwnSearch tool call failed.",
+    nextActions: [
+      "Inspect the error message below.",
+      "If this is an environment issue, retry in a normal terminal outside the agent sandbox."
+    ]
+  };
+}
 function asError(error) {
   const message = error instanceof Error ? error.message : String(error);
+  const diagnosis = diagnoseError(message);
   return {
     isError: true,
     content: [
       {
         type: "text",
-        text:
+        text: JSON.stringify({
+          summary: diagnosis.summary,
+          error: message,
+          nextActions: diagnosis.nextActions
+        }, null, 2)
       }
     ]
   };
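With this change, success results (via `withGuidance`) and error results (via `asError`) both carry a single JSON document in `content[0].text`, so a client can parse either shape uniformly. A sketch under that assumption; `parseToolResult` is an illustrative helper, not part of the package:

```javascript
// Parse an OwnSearch MCP tool result into a flat object with
// summary, nextActions, and either data or error fields. Both the
// withGuidance() and asError() envelopes serialize one JSON object
// into content[0].text.
function parseToolResult(result) {
  const payload = JSON.parse(result.content[0].text);
  return { isError: Boolean(result.isError), ...payload };
}
```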
@@ -41,7 +113,7 @@ function asError(error) {
 var server = new Server(
   {
     name: "ownsearch",
-    version:
+    version: SERVER_VERSION
   },
   {
     capabilities: {
@@ -51,9 +123,22 @@ var server = new Server(
 );
 server.setRequestHandler(ListToolsRequestSchema, async () => ({
   tools: [
+    {
+      name: "get_retrieval_skill",
+      description: "Read the bundled OwnSearch retrieval skill. Call this first if you want explicit guidance on query rewriting, search strategy, grounded answering, and failure recovery.",
+      inputSchema: {
+        type: "object",
+        properties: {
+          skillName: {
+            type: "string",
+            description: `Optional skill name. Default is ${BUNDLED_SKILL_NAME}.`
+          }
+        }
+      }
+    },
     {
       name: "index_path",
-      description: "
+      description: "Index an approved local folder recursively, including nested subfolders. Use this before search. Returns the registered root and indexing counts. For best retrieval behavior, read `get_retrieval_skill` once before planning search calls.",
       inputSchema: {
         type: "object",
         properties: {
@@ -65,7 +150,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
     },
     {
       name: "search",
-      description: "Semantic search over one root or the full local
+      description: "Semantic search over one root or the full local store. Use `rootIds` when you want deterministic scope. If you do not know the root ID yet, call `list_roots` first.",
       inputSchema: {
         type: "object",
         properties: {
@@ -81,9 +166,27 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
         required: ["query"]
       }
     },
+    {
+      name: "literal_search",
+      description: "Exact text search backed by ripgrep. Prefer this for strong keywords, exact names, IDs, error strings, titles, or other literal queries where grep-style matching is better than semantic retrieval.",
+      inputSchema: {
+        type: "object",
+        properties: {
+          query: { type: "string", description: "Exact text to search for." },
+          rootIds: {
+            type: "array",
+            items: { type: "string" },
+            description: "Optional list of root IDs to restrict search."
+          },
+          pathSubstring: { type: "string", description: "Optional file path substring filter." },
+          limit: { type: "number", description: "Maximum result count. Default 20." }
+        },
+        required: ["query"]
+      }
+    },
     {
       name: "search_context",
-      description: "Search and return a
+      description: "Search and return a grounded context bundle for answer synthesis. Prefer this for question answering. If results are empty, check root scope, indexing completion, and environment connectivity.",
       inputSchema: {
         type: "object",
         properties: {
@@ -100,9 +203,29 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
         required: ["query"]
       }
     },
+    {
+      name: "deep_search_context",
+      description: "Run a deeper multi-query retrieval pass for archive-style, ambiguous, or recall-heavy questions. This expands the query, searches multiple variants, diversifies sources, and returns a richer grounded bundle.",
+      inputSchema: {
+        type: "object",
+        properties: {
+          query: { type: "string", description: "Natural language question or concept to investigate." },
+          rootIds: {
+            type: "array",
+            items: { type: "string" },
+            description: "Optional list of root IDs to restrict search."
+          },
+          pathSubstring: { type: "string", description: "Optional file path substring filter." },
+          perQueryLimit: { type: "number", description: "Max hits per query variant. Default 6." },
+          finalLimit: { type: "number", description: "Max final aggregated hits. Default 10." },
+          maxChars: { type: "number", description: "Max total characters in the returned context bundle. Default 16000." }
+        },
+        required: ["query"]
+      }
+    },
     {
       name: "get_chunks",
-      description: "Fetch exact indexed chunks by id after
+      description: "Fetch exact indexed chunks by id after `search` or `search_context`. Use this when exact wording matters.",
       inputSchema: {
         type: "object",
         properties: {
@@ -117,7 +240,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
     },
     {
       name: "list_roots",
-      description: "List
+      description: "List indexed roots with their IDs. Use this before scoped search if you only know the human-readable folder name.",
       inputSchema: {
         type: "object",
         properties: {}
@@ -125,7 +248,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
     },
     {
       name: "delete_root",
-      description: "Delete one indexed root from config and vector storage.",
+      description: "Delete one indexed root from config and vector storage. This removes its indexed vectors.",
       inputSchema: {
         type: "object",
         properties: {
@@ -136,7 +259,7 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
     },
     {
       name: "store_status",
-      description: "Inspect Qdrant collection status for
+      description: "Inspect the local Qdrant collection status. Use this for environment diagnostics when search behaves unexpectedly.",
       inputSchema: {
         type: "object",
         properties: {}
@@ -147,13 +270,37 @@ server.setRequestHandler(ListToolsRequestSchema, async () => ({
 server.setRequestHandler(CallToolRequestSchema, async (request) => {
   try {
     switch (request.params.name) {
+      case "get_retrieval_skill": {
+        const args = request.params.arguments;
+        const skillName = args?.skillName?.trim() || BUNDLED_SKILL_NAME;
+        const skill = await readBundledSkill(skillName);
+        return withGuidance(
+          `Loaded bundled retrieval skill ${skillName}.`,
+          {
+            skillName,
+            skill
+          },
+          [
+            "Use this skill to rewrite weak user requests into stronger retrieval queries.",
+            "Prefer `search_context` for grounded answering and `get_chunks` when exact wording matters."
+          ]
+        );
+      }
       case "index_path": {
         const args = request.params.arguments;
         if (!args?.path) {
           throw new OwnSearchError("`path` is required.");
         }
         const result = await indexPath(args.path, { name: args.name });
-        return
+        return withGuidance(
+          `Indexed folder ${args.path}.`,
+          result,
+          [
+            `Call \`get_retrieval_skill\` once if you want explicit OwnSearch query-planning guidance.`,
+            "Use `list_roots` to confirm the registered root ID if you need scoped search.",
+            "Then call `search_context` for grounded retrieval or `search` for ranked hits."
+          ]
+        );
       }
       case "search": {
         const args = request.params.arguments;
@@ -171,10 +318,69 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
           },
           Math.max(1, Math.min(args.limit ?? 5, 20))
         );
-
+        if (hits.length === 0) {
+          return withGuidance(
+            "Search completed but returned no results.",
+            {
+              query: args.query,
+              hits
+            },
+            [
+              "If you intended to search one indexed folder, call `list_roots` and confirm the correct `rootIds` value.",
+              "If indexing may have been interrupted, call `index_path` again for that folder.",
+              "If this server is running in a restricted environment and earlier calls showed `fetch failed`, verify Gemini and Qdrant connectivity outside the sandbox."
+            ]
+          );
+        }
+        return withGuidance(
+          `Search returned ${hits.length} hit(s).`,
+          {
+            query: args.query,
+            hits
+          },
+          [
+            "Use `literal_search` instead when the user gives strong exact strings, IDs, names, or titles.",
+            "If you have not read the OwnSearch retrieval guidance yet, call `get_retrieval_skill` first.",
+            "Use `search_context` if you want a compact grounded bundle for answering.",
+            "Use `get_chunks` on selected hit IDs when exact wording matters."
+          ]
+        );
+      }
+      case "literal_search": {
+        const args = request.params.arguments;
+        if (!args?.query) {
+          throw new OwnSearchError("`query` is required.");
+        }
+        const matches = await literalSearch({
           query: args.query,
-
+          rootIds: args.rootIds,
+          pathSubstring: args.pathSubstring,
+          limit: args.limit
         });
+        if (matches.length === 0) {
+          return withGuidance(
+            "Literal search completed but returned no exact matches.",
+            {
+              query: args.query,
+              matches
+            },
+            [
+              "If the user request is more conceptual or paraphrased, switch to `search_context` or `deep_search_context`.",
+              "If you expected a scoped result, call `list_roots` and verify the correct root ID."
+            ]
+          );
+        }
+        return withGuidance(
+          `Literal search returned ${matches.length} exact match(es).`,
+          {
+            query: args.query,
+            matches
+          },
+          [
+            "Use these results when exact wording, names, IDs, or titles matter.",
+            "Switch to `search_context` or `deep_search_context` if you need semantic expansion or multi-document synthesis."
+          ]
        );
       }
       case "search_context": {
         const args = request.params.arguments;
@@ -192,7 +398,65 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
           },
           Math.max(1, Math.min(args.limit ?? 8, 20))
         );
-
+        if (hits.length === 0) {
+          return withGuidance(
+            "Context search completed but returned no results.",
+            {
+              query: args.query,
+              totalChars: 0,
+              results: []
+            },
+            [
+              "Call `list_roots` to confirm the target root ID.",
+              "Retry `search` with the same query to inspect raw hits.",
+              "If indexing may not have completed, call `index_path` again for the folder."
+            ]
+          );
+        }
+        const bundle = buildContextBundle(args.query, hits, Math.max(500, args.maxChars ?? 12e3));
+        return withGuidance(
+          `Context bundle built from ${bundle.results.length} result block(s).`,
+          bundle,
+          [
+            "Use `literal_search` first when the query contains a strong exact string or title.",
+            "If retrieval planning is weak or ambiguous, call `get_retrieval_skill` for query-rewrite guidance.",
+            "Answer using only the returned context when possible.",
+            "If you need exact source text, call `get_chunks` with the contributing chunk IDs from `search`."
+          ]
+        );
+      }
+      case "deep_search_context": {
+        const args = request.params.arguments;
+        if (!args?.query) {
+          throw new OwnSearchError("`query` is required.");
+        }
+        const result = await deepSearchContext(args.query, {
+          rootIds: args.rootIds,
+          pathSubstring: args.pathSubstring,
+          perQueryLimit: args.perQueryLimit,
+          finalLimit: args.finalLimit,
+          maxChars: args.maxChars
+        });
+        if (result.bundle.results.length === 0) {
+          return withGuidance(
+            "Deep retrieval completed but still found no grounded evidence.",
+            result,
+            [
+              "Call `list_roots` to confirm the root scope.",
+              "Retry with a shorter or more literal query.",
+              "If the corpus was indexed recently, call `index_path` again to ensure indexing completed."
+            ]
+          );
+        }
+        return withGuidance(
+          `Deep retrieval built a richer bundle from ${result.distinctFiles} distinct file(s) across ${result.queryVariants.length} query variant(s).`,
+          result,
+          [
+            "Use `literal_search` instead when the user gives a precise title, error string, or identifier.",
+            "Use this result for archive-style or multi-document synthesis.",
+            "If you need exact wording, follow up with `search` and `get_chunks` on the strongest source files."
+          ]
+        );
       }
       case "get_chunks": {
         const args = request.params.arguments;
@@ -201,11 +465,19 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
         }
         const store = await createStore();
         const chunks = await store.getChunks(args.ids);
-        return
+        return withGuidance(
+          `Fetched ${chunks.length} chunk(s).`,
+          { chunks },
+          chunks.length ? ["Use these exact chunks when precise quoting or comparison matters."] : ["No matching chunk IDs were found. Re-run `search` and use returned hit IDs."]
+        );
       }
       case "list_roots": {
         const config = await loadConfig();
-        return
+        return withGuidance(
+          `Found ${config.roots.length} indexed root(s).`,
+          { roots: config.roots },
+          config.roots.length ? ["Use the returned `id` values in `search` or `search_context` when you want scoped retrieval."] : ["No roots are indexed yet. Call `index_path` on a local folder first."]
+        );
       }
       case "delete_root": {
         const args = request.params.arguments;
@@ -219,14 +491,26 @@ server.setRequestHandler(CallToolRequestSchema, async (request) => {
         const store = await createStore();
         await store.deleteRoot(root.id);
         await deleteRootDefinition(root.id);
-        return
-
-
+        return withGuidance(
+          `Deleted root ${root.id}.`,
+          {
+            deleted: true,
+            root
+          },
+          ["Call `list_roots` to confirm the remaining indexed roots."]
+        );
       }
       case "store_status": {
         const store = await createStore();
-
+        const status = await store.getStatus();
+        return withGuidance(
+          "Retrieved vector store status.",
+          status,
+          [
+            "If search fails, check `pointsCount`, `indexedVectorsCount`, and collection status here.",
+            "Run `list_roots` next if you need to scope searches by root."
+          ]
        );
       }
       default:
        throw new OwnSearchError(`Unknown tool: ${request.params.name}`);
package/package.json
CHANGED
(version bump: 0.1.4 → 0.1.5)

package/skills/ownsearch-rag-search/SKILL.md
CHANGED
@@ -14,9 +14,10 @@ Use this skill to bridge the gap between what a user asks and what OwnSearch sho
 1. Classify the user request.
 2. Generate one to four retrieval queries.
 3. Start with `search_context` for the strongest query.
-4.
-5.
-6.
+4. Use `deep_search_context` for archive-style, ambiguous, or recall-heavy questions.
+5. Expand to additional searches only if evidence is weak, duplicate-heavy, or incomplete.
+6. Use `get_chunks` after `search` when the answer needs exact wording, detailed comparison, or citation-grade grounding.
+7. Answer only from retrieved evidence. Say when the retrieved context is insufficient.

 ## Query Planning

@@ -55,6 +56,19 @@ Use `search_context` when:
 - the answer can be supported by a few chunks
 - low latency matters more than exhaustive recall

+Use `literal_search` when:
+
+- the user gives an exact title, name, identifier, error string, or quoted phrase
+- you want grep-style lookup before semantic expansion
+- you suspect the right answer is present literally and want to avoid semantic drift
+
+Use `deep_search_context` when:
+
+- the question spans multiple documents or timelines
+- the answer is likely to require recall beyond the top few semantic hits
+- the user asks "what is", "what happened", or "tell me the full story" for an entity or event
+- the first-pass `search_context` result feels too thin
+
 Use `search` when:

 - you want to inspect ranking and source distribution
@@ -124,6 +138,19 @@ For a normal grounded answer:
 4. If results look weak or ambiguous, call `search` with another variant.
 5. Fetch exact chunks for the best IDs before making precise claims.

+For an exact-string lookup:
+
+1. Start with `literal_search`.
+2. If literal hits are enough, answer from them or fetch exact chunks nearby.
+3. If literal hits are sparse or too narrow, switch to `search_context`.
+
+For an archive-style or lore-style question:
+
+1. Start with `deep_search_context`.
+2. Inspect the query variants and source spread.
+3. If the answer depends on exact chronology or wording, follow with `search`.
+4. Fetch exact chunks from the strongest files before making strong claims.
+
 For a locate-the-source task:

 1. Use `search` first.