qmdr 1.0.0 → 1.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (70)
  1. package/AI-SETUP.md +61 -98
  2. package/docs/setup-openclaw.md +20 -3
  3. package/package.json +1 -1
  4. package/qmd-wrapper-backup +12 -0
  5. package/src/llm.ts +28 -1
  6. package/src/qmd.ts +29 -2
  7. package/.github/workflows/release.yml +0 -77
  8. package/finetune/BALANCED_DISTRIBUTION.md +0 -157
  9. package/finetune/DATA_IMPROVEMENTS.md +0 -218
  10. package/finetune/Justfile +0 -43
  11. package/finetune/Modelfile +0 -16
  12. package/finetune/README.md +0 -299
  13. package/finetune/SCORING.md +0 -286
  14. package/finetune/configs/accelerate_multi_gpu.yaml +0 -17
  15. package/finetune/configs/grpo.yaml +0 -49
  16. package/finetune/configs/sft.yaml +0 -42
  17. package/finetune/configs/sft_local.yaml +0 -40
  18. package/finetune/convert_gguf.py +0 -221
  19. package/finetune/data/best_glm_prompt.txt +0 -17
  20. package/finetune/data/gepa_generated.prompts.json +0 -32
  21. package/finetune/data/qmd_expansion_balanced_deduped.jsonl +0 -413
  22. package/finetune/data/qmd_expansion_diverse_addon.jsonl +0 -386
  23. package/finetune/data/qmd_expansion_handcrafted.jsonl +0 -65
  24. package/finetune/data/qmd_expansion_handcrafted_only.jsonl +0 -336
  25. package/finetune/data/qmd_expansion_locations.jsonl +0 -64
  26. package/finetune/data/qmd_expansion_people.jsonl +0 -46
  27. package/finetune/data/qmd_expansion_short_nontech.jsonl +0 -200
  28. package/finetune/data/qmd_expansion_v2.jsonl +0 -1498
  29. package/finetune/data/qmd_only_sampled.jsonl +0 -399
  30. package/finetune/dataset/analyze_data.py +0 -369
  31. package/finetune/dataset/clean_data.py +0 -906
  32. package/finetune/dataset/generate_balanced.py +0 -823
  33. package/finetune/dataset/generate_data.py +0 -714
  34. package/finetune/dataset/generate_data_offline.py +0 -206
  35. package/finetune/dataset/generate_diverse.py +0 -441
  36. package/finetune/dataset/generate_ollama.py +0 -326
  37. package/finetune/dataset/prepare_data.py +0 -197
  38. package/finetune/dataset/schema.py +0 -73
  39. package/finetune/dataset/score_data.py +0 -115
  40. package/finetune/dataset/validate_schema.py +0 -104
  41. package/finetune/eval.py +0 -196
  42. package/finetune/evals/queries.txt +0 -56
  43. package/finetune/gepa/__init__.py +0 -1
  44. package/finetune/gepa/best_prompt.txt +0 -31
  45. package/finetune/gepa/best_prompt_glm.txt +0 -1
  46. package/finetune/gepa/dspy_gepa.py +0 -204
  47. package/finetune/gepa/example.py +0 -117
  48. package/finetune/gepa/generate.py +0 -129
  49. package/finetune/gepa/gepa_outputs.jsonl +0 -10
  50. package/finetune/gepa/gepa_outputs_glm.jsonl +0 -20
  51. package/finetune/gepa/model.json +0 -19
  52. package/finetune/gepa/optimizer.py +0 -70
  53. package/finetune/gepa/score.py +0 -84
  54. package/finetune/jobs/eval.py +0 -490
  55. package/finetune/jobs/eval_common.py +0 -354
  56. package/finetune/jobs/eval_verbose.py +0 -113
  57. package/finetune/jobs/grpo.py +0 -141
  58. package/finetune/jobs/quantize.py +0 -244
  59. package/finetune/jobs/sft.py +0 -121
  60. package/finetune/pyproject.toml +0 -23
  61. package/finetune/reward.py +0 -610
  62. package/finetune/train.py +0 -611
  63. package/finetune/uv.lock +0 -4070
  64. package/src/cli.test.ts +0 -963
  65. package/src/eval.test.ts +0 -412
  66. package/src/llm.test.ts +0 -559
  67. package/src/mcp.test.ts +0 -889
  68. package/src/store-paths.test.ts +0 -395
  69. package/src/store.test.ts +0 -2483
  70. package/test/eval-harness.ts +0 -223
package/AI-SETUP.md CHANGED
@@ -112,9 +112,16 @@ chmod +x /usr/local/bin/qmd
  
  ---
  
- ## Phase 3: Configure Embedding (SiliconFlow)
+ ## Phase 3: Configure API Keys
  
- SiliconFlow provides the embedding model. Ask:
+ All configuration goes into one file: `~/.config/qmd/.env`
+
+ QMDR automatically reads this file on startup (both Bun and binary installs). Create it:
+
+ ```bash
+ mkdir -p ~/.config/qmd
+ nano ~/.config/qmd/.env # or any editor
+ ```
  
  > **Do you have a SiliconFlow API key?**
  >
@@ -124,8 +131,9 @@ SiliconFlow provides the embedding model. Ask:
  
  ### Option 1: User has key
  
- ```bash
- export QMD_SILICONFLOW_API_KEY=sk-user-provided-key
+ Write to `~/.config/qmd/.env`:
+ ```
+ QMD_SILICONFLOW_API_KEY=sk-user-provided-key
  ```
  
  ### Option 2: Guide to get key
@@ -142,76 +150,45 @@ export QMD_SILICONFLOW_API_KEY=sk-user-provided-key
  > 2. Go to API Keys page → Create new key
  > 3. Free tier includes embedding models
  
- After user provides key:
- ```bash
- export QMD_SILICONFLOW_API_KEY=sk-user-provided-key
+ After user provides key, write to `~/.config/qmd/.env`:
+ ```
+ QMD_SILICONFLOW_API_KEY=sk-user-provided-key
  ```
  
- **International users — additional API endpoint config:**
- ```bash
- export QMD_SILICONFLOW_BASE_URL=https://api.siliconflow.com/v1
+ **International users — add this line too:**
+ ```
+ QMD_SILICONFLOW_BASE_URL=https://api.siliconflow.com/v1
  ```
  (China users don't need this — the default `https://api.siliconflow.cn/v1` works)
  
  ### Option 3: Custom provider
  
- Ask for:
- 1. API endpoint URL (must be OpenAI-compatible)
- 2. API key
-
- ```bash
- export QMD_OPENAI_API_KEY=user-key
- export QMD_OPENAI_BASE_URL=https://their-endpoint.com/v1
- export QMD_EMBED_PROVIDER=openai
+ Ask for endpoint URL and API key, then write to `~/.config/qmd/.env`:
  ```
-
- Test connectivity:
- ```bash
- curl -s "$QMD_OPENAI_BASE_URL/models" -H "Authorization: Bearer $QMD_OPENAI_API_KEY" | head -5
+ QMD_OPENAI_API_KEY=user-key
+ QMD_OPENAI_BASE_URL=https://their-endpoint.com/v1
+ QMD_EMBED_PROVIDER=openai
  ```
  
  ### Configure embedding model
  
  Default: `Qwen/Qwen3-Embedding-8B` (on SiliconFlow, free)
  
- Ask:
- > Keep the default embedding model (`Qwen/Qwen3-Embedding-8B`), or choose your own?
- >
- > If you don't know which to pick, the default is good.
+ > Keep the default embedding model, or choose your own?
  
- If user wants to change:
- ```bash
- export QMD_SILICONFLOW_EMBED_MODEL=their-chosen-model
+ If user wants to change, add to `.env`:
+ ```
+ QMD_SILICONFLOW_EMBED_MODEL=their-chosen-model
  # or for custom provider:
- export QMD_OPENAI_EMBED_MODEL=their-chosen-model
+ QMD_OPENAI_EMBED_MODEL=their-chosen-model
  ```
  
- ### Configure embedding dimensions
-
- > Do you know the output dimensions of your embedding model?
- >
- > 1. **Use default** (auto-detect from model — recommended)
- > 2. **I know the dimensions** — let me enter it
- > 3. **I don't know** — please look it up for me
-
- If option 3: search the web for "{model_name} embedding dimensions" and inform the user.
-
- Note: If dimensions change after initial indexing, user must run `qmd embed -f` to rebuild.
-
  ### Configure chunk size
  
- Default: `200` tokens per chunk, `40` tokens overlap.
-
- > Chunk size controls how documents are split for embedding.
- >
- > - **Default: 200 tokens** (recommended for most use cases)
- > - Larger chunks = more context per result, but less precise
- > - Smaller chunks = more precise, but may lose context
-
- ```bash
- # Only if user wants to change defaults:
- export QMD_CHUNK_SIZE_TOKENS=200 # default
- export QMD_CHUNK_OVERLAP_TOKENS=40 # default
+ Default: `200` tokens per chunk, `40` tokens overlap. Only add to `.env` if user wants to change:
+ ```
+ QMD_CHUNK_SIZE_TOKENS=200
+ QMD_CHUNK_OVERLAP_TOKENS=40
  ```
  
  ---
@@ -222,22 +199,19 @@ Query expansion rewrites the user's search query into multiple variations (keywo
  
  Default: `GLM-4.5-Air` on SiliconFlow (~¥1/month, fast, good quality).
  
- Ask:
- > Use the default query expansion model (`GLM-4.5-Air` on SiliconFlow), or use your own?
+ > Use the default query expansion model, or use your own?
  >
  > 1. **Default** (GLM-4.5-Air on SiliconFlow — recommended)
  > 2. **Use the reranking provider's model** (configured in next step)
  > 3. **Custom** — I want to specify a model
  
- If option 2: will be configured after Phase 5.
-
- If option 3:
- ```bash
- export QMD_QUERY_EXPANSION_PROVIDER=openai # or gemini
- # For OpenAI-compatible:
- export QMD_OPENAI_MODEL=their-model-name
- # For Gemini:
- export QMD_GEMINI_API_KEY=their-key
+ If option 3, add to `~/.config/qmd/.env`:
+ ```
+ QMD_QUERY_EXPANSION_PROVIDER=openai
+ QMD_OPENAI_MODEL=their-model-name
+ # Or for Gemini:
+ QMD_QUERY_EXPANSION_PROVIDER=gemini
+ QMD_GEMINI_API_KEY=their-key
  ```
  
  ---
@@ -256,9 +230,10 @@ Reranking uses a large language model to judge which search results are truly re
  
  ### Option 1: User has Gemini key
  
- ```bash
- export QMD_GEMINI_API_KEY=user-key
- export QMD_RERANK_PROVIDER=gemini
+ Add to `~/.config/qmd/.env`:
+ ```
+ QMD_GEMINI_API_KEY=user-key
+ QMD_RERANK_PROVIDER=gemini
  ```
  
  ### Option 2: Guide to get Gemini key
@@ -268,48 +243,34 @@ export QMD_RERANK_PROVIDER=gemini
  > 2. Click "Get API key" → Create key
  > 3. Free tier: 15 RPM / 1M tokens per day (more than enough for reranking)
  
- Note for China users: Gemini API may require a proxy. If the user has a proxy:
- ```bash
- export QMD_GEMINI_BASE_URL=https://their-proxy-endpoint
+ Note for China users: Gemini API may require a proxy. Add to `.env`:
+ ```
+ QMD_GEMINI_BASE_URL=https://their-proxy-endpoint
  ```
  
  If no proxy available, recommend Option 3 instead.
  
  ### Option 3: Alternative reranking
  
+ Add to `~/.config/qmd/.env`:
+
  **Using SiliconFlow LLM rerank (no extra key needed):**
- ```bash
- export QMD_RERANK_PROVIDER=siliconflow
- export QMD_RERANK_MODE=llm
+ ```
+ QMD_RERANK_PROVIDER=siliconflow
+ QMD_RERANK_MODE=llm
  ```
  
- **Using a dedicated rerank model API (e.g. BAAI/bge-reranker):**
- ```bash
- export QMD_RERANK_PROVIDER=siliconflow
- export QMD_RERANK_MODE=rerank
- export QMD_SILICONFLOW_RERANK_MODEL=BAAI/bge-reranker-v2-m3
+ **Using a dedicated rerank model API:**
+ ```
+ QMD_RERANK_PROVIDER=siliconflow
+ QMD_RERANK_MODE=rerank
+ QMD_SILICONFLOW_RERANK_MODEL=BAAI/bge-reranker-v2-m3
  ```
  
  **Using OpenAI-compatible endpoint:**
- ```bash
- export QMD_RERANK_PROVIDER=openai
- export QMD_RERANK_MODE=llm
- export QMD_OPENAI_API_KEY=their-key
- export QMD_OPENAI_BASE_URL=https://their-endpoint/v1
  ```
-
- For Claude Code / OpenCode users: you can reuse whichever model API you're already paying for.
-
- ### Model selection
-
- Default reranking model: `gemini-2.5-flash` with `thinkingBudget: 0` (no reasoning overhead, pure relevance judgment).
-
- > Keep the default reranking model (`gemini-2.5-flash`), or change it?
-
- If user wants to change:
- ```bash
- export QMD_GEMINI_MODEL=their-model # for Gemini provider
- export QMD_LLM_RERANK_MODEL=their-model # for SiliconFlow/OpenAI LLM rerank
+ QMD_RERANK_PROVIDER=openai
+ QMD_RERANK_MODE=llm
  ```
  
  ---
@@ -414,7 +375,7 @@ After setup, add this to `AGENTS.md` in the project root:
  ## Environment Variables Reference
  
  | Variable | Default | Purpose |
- |----------|---------|---------||
+ |----------|---------|---------|
  | **API Keys** | | |
  | `QMD_SILICONFLOW_API_KEY` | — | SiliconFlow |
  | `QMD_GEMINI_API_KEY` | — | Google Gemini |
@@ -443,6 +404,8 @@ After setup, add this to `AGENTS.md` in the project root:
  | `QMD_RERANK_DOC_LIMIT` | `40` | Max docs for reranking |
  | `QMD_RERANK_CHUNKS_PER_DOC` | `3` | Chunks per doc for reranking |
  | **Paths** | | |
+ | `QMD_CONFIG_DIR` | `~/.config/qmd` | Config directory (index.yml + .env location) |
+ | `XDG_CACHE_HOME` | `~/.cache` | Cache directory (database at `$XDG_CACHE_HOME/qmd/index.sqlite`) |
  | `QMD_SQLITE_VEC_PATH` | auto | sqlite-vec native extension path |
  
  **If you change the embedding model, run `qmd embed -f` to rebuild the vector index.**
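Pulling the Phase 3-5 changes together, a completed `~/.config/qmd/.env` for the default SiliconFlow-plus-Gemini path might look like the following. This is an illustration only, using placeholder keys; every variable name comes from the reference table above.

```
# SiliconFlow: embeddings + query expansion (Phase 3-4)
QMD_SILICONFLOW_API_KEY=sk-placeholder

# Gemini: reranking (Phase 5)
QMD_GEMINI_API_KEY=placeholder-key
QMD_RERANK_PROVIDER=gemini
```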
package/docs/setup-openclaw.md CHANGED
@@ -79,7 +79,11 @@ Edit `~/.openclaw/openclaw.json` — add/merge the `memory` block:
  
  API keys must be available to the OpenClaw process, not just your shell.
  
- **macOS (launchd):**
+ **Option A: Use `~/.config/qmd/.env` (recommended — works for all install types)**
+
+ QMDR auto-loads this file on startup. If you already configured it in Phase 3-5, you're done — OpenClaw will inherit these settings automatically.
+
+ **Option B: macOS (launchd) — set in the service plist:**
  Add to `~/Library/LaunchAgents/ai.openclaw.gateway.plist` under `EnvironmentVariables`:
  ```xml
  <key>QMD_SILICONFLOW_API_KEY</key>
@@ -89,7 +93,7 @@ Add to `~/Library/LaunchAgents/ai.openclaw.gateway.plist` under `EnvironmentVari
  ```
  Then reload: `launchctl unload ~/Library/LaunchAgents/ai.openclaw.gateway.plist && launchctl load ~/Library/LaunchAgents/ai.openclaw.gateway.plist`
  
- **Linux (systemd):**
+ **Option C: Linux (systemd) — set in the service unit:**
  Add to the service unit under `[Service]`:
  ```ini
  Environment=QMD_SILICONFLOW_API_KEY=sk-your-key
@@ -97,13 +101,26 @@ Environment=QMD_GEMINI_API_KEY=your-gemini-key
  ```
  Then: `systemctl --user daemon-reload && systemctl --user restart openclaw`
  
+ **Priority:** System/process env vars > `~/.config/qmd/.env` (env vars override .env if both are set).
+
  ## Step 5: Restart and verify
  
+ Before restarting, confirm:
+ 1. `qmd doctor` passes with no errors ✅
+ 2. `qmd query "test"` returns results ✅
+ 3. API keys are in launchd/systemd env (Step 4) ✅
+
  ```bash
  openclaw gateway restart
+
+ # Or if using OpenClaw CLI:
+ openclaw gateway stop && openclaw gateway start
  ```
  
- Then test by asking your bot about past conversations, or check logs for `"backend": "qmd"`.
+ After restart, verify QMDR is active:
+ - Ask your bot about something from past conversations
+ - Check that `memory_search` results show `"provider": "qmd"` (not `"sqlite"`)
+ - If results seem to fall back to basic search, check `qmd doctor` inside the OpenClaw process environment
  
  ## Step 6: Add usage tips to TOOLS.md
  
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "qmdr",
-   "version": "1.0.0",
+   "version": "1.0.2",
    "description": "Remote-first CLI search engine for markdown docs, notes, and knowledge bases",
    "type": "module",
    "bin": {
package/qmd-wrapper-backup ADDED
@@ -0,0 +1,12 @@
+ #!/bin/bash
+ # QMD wrapper - runs via bun instead of compiled binary
+ # When --json flag is present, redirect stderr separately to avoid breaking JSON output
+ # The debug console.log calls in qmd go to stdout; we need to filter them
+ if echo "$@" | grep -q '\-\-json'; then
+   # Run and extract only the JSON array from stdout
+   output=$(bun run /Users/fu/clawd/qmd/src/qmd.ts "$@" 2>/dev/null)
+   # Extract JSON: find the line starting with [ and everything after
+   echo "$output" | sed -n '/^\[/,$ p'
+ else
+   exec bun run /Users/fu/clawd/qmd/src/qmd.ts "$@"
+ fi
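The `sed -n '/^\[/,$ p'` trick in this wrapper keeps everything from the first line that starts with `[` through end of input, discarding any debug lines printed beforehand. A quick standalone check of that behavior, with hypothetical input (not actual qmd output):

```shell
# Simulate qmd stdout: a stray debug line, then the JSON array.
# sed prints only from the first line matching ^\[ to the end.
printf 'debug: loading index\n[{"id":1}]\n' | sed -n '/^\[/,$ p'
# → [{"id":1}]
```

Note this only works when the JSON array starts at column 0 and no debug line begins with `[` — both assumptions, not guarantees.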
package/src/llm.ts CHANGED
@@ -26,7 +26,16 @@ type LlamaModel = any;
  type LlamaEmbeddingContext = any;
  type LlamaToken = any;
  let LlamaChatSession: any = null;
- getLlamaCpp().then(m => { LlamaChatSession = m.LlamaChatSession; }).catch(() => {});
+ // Lazy-init LlamaChatSession only when actually needed (avoid eager import crash on Linux CI)
+ async function ensureLlamaChatSession() {
+   if (!LlamaChatSession) {
+     try {
+       const m = await getLlamaCpp();
+       LlamaChatSession = m.LlamaChatSession;
+     } catch {}
+   }
+   return LlamaChatSession;
+ }
  import { homedir } from "os";
  import { join } from "path";
  import { existsSync, mkdirSync, statSync, unlinkSync, readdirSync, readFileSync, writeFileSync } from "fs";
@@ -964,6 +973,7 @@ export class LlamaCpp implements LLM {
  
  export type RemoteLLMConfig = {
    rerankProvider: 'siliconflow' | 'gemini' | 'openai';
+   rerankMode?: 'llm' | 'rerank'; // 'llm' = chat model, 'rerank' = dedicated rerank API
    embedProvider?: 'siliconflow' | 'openai'; // remote embedding provider (optional)
    queryExpansionProvider?: 'siliconflow' | 'gemini' | 'openai'; // remote query expansion (optional)
    siliconflow?: {
@@ -1040,6 +1050,23 @@ export class RemoteLLM implements LLM {
    options: RerankOptions = {}
  ): Promise<RerankResult> {
    if (this.config.rerankProvider === 'siliconflow') {
+     // LLM mode: use SiliconFlow's OpenAI-compatible chat API for reranking
+     if (this.config.rerankMode === 'llm') {
+       // Build a temporary openai-like config from siliconflow settings
+       const sf = this.config.siliconflow;
+       if (!sf?.apiKey) throw new Error("SiliconFlow API key required for LLM rerank");
+       const savedOpenai = this.config.openai;
+       this.config.openai = {
+         apiKey: sf.apiKey,
+         baseUrl: (sf.baseUrl || "https://api.siliconflow.cn/v1").replace(/\/$/, ""),
+         model: sf.queryExpansionModel || "zai-org/GLM-4.5-Air",
+       };
+       try {
+         return await this.rerankWithOpenAI(query, documents, options);
+       } finally {
+         this.config.openai = savedOpenai;
+       }
+     }
      return this.rerankWithSiliconflow(query, documents, options);
    }
    if (this.config.rerankProvider === 'openai') {
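The new LLM-rerank branch above works by borrowing the existing OpenAI-compatible code path: it installs a synthetic `openai` config built from the SiliconFlow settings, delegates, then restores the original in a `finally` block so the swap survives thrown errors. A minimal sketch of that swap-and-restore pattern, with simplified types and a hypothetical helper name (not the package's API):

```typescript
// Simplified stand-in for RemoteLLMConfig — only the field the swap touches.
type Config = { openai?: { apiKey: string; baseUrl: string; model: string } };

// Run fn() with cfg.openai temporarily replaced; restore it even if fn throws.
async function withTempOpenAI<T>(
  cfg: Config,
  temp: NonNullable<Config["openai"]>,
  fn: () => Promise<T>,
): Promise<T> {
  const saved = cfg.openai;
  cfg.openai = temp;
  try {
    return await fn();
  } finally {
    cfg.openai = saved; // restore no matter what happened inside fn
  }
}
```

Mutating shared config like this is safe only as long as rerank calls are not interleaved; a purer variant would pass the synthetic config as an argument instead.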
package/src/qmd.ts CHANGED
@@ -90,6 +90,31 @@ import { handleCollectionCommand } from "./app/commands/collection.js";
  import { handleSearchCommand, handleVSearchCommand, handleQueryCommand } from "./app/commands/search.js";
  import { handleCleanupCommand, handlePullCommand, handleStatusCommand, handleUpdateCommand, handleEmbedCommand, handleMcpCommand, handleDoctorCommand } from "./app/commands/maintenance.js";
  import { createLLMService } from "./app/services/llm-service.js";
+ import { existsSync } from "fs";
+ import { join } from "path";
+
+ // =============================================================================
+ // Load config from ~/.config/qmd/.env (single source of truth)
+ // Environment variables set externally take priority (won't be overwritten)
+ // =============================================================================
+
+ const qmdConfigDir = process.env.QMD_CONFIG_DIR || join(homedir(), ".config", "qmd");
+ const qmdEnvPath = join(qmdConfigDir, ".env");
+ if (existsSync(qmdEnvPath)) {
+   const envContent = readFileSync(qmdEnvPath, "utf-8");
+   for (const line of envContent.split("\n")) {
+     const trimmed = line.trim();
+     if (!trimmed || trimmed.startsWith("#")) continue;
+     const eqIdx = trimmed.indexOf("=");
+     if (eqIdx === -1) continue;
+     const key = trimmed.slice(0, eqIdx).trim();
+     const val = trimmed.slice(eqIdx + 1).trim().replace(/^["']|["']$/g, "");
+     // Don't override existing env vars (system/process env takes priority)
+     if (!process.env[key]) {
+       process.env[key] = val;
+     }
+   }
+ }
  
  // Enable production mode - allows using default database path
  // Tests must set INDEX_PATH or use createStore() with explicit path
@@ -270,9 +295,10 @@ function getRemoteLLM(): RemoteLLM | null {
    if (rerankProvider === 'gemini' || rerankProvider === 'openai') {
      effectiveRerankProvider = rerankProvider;
    } else if (rerankProvider === 'siliconflow') {
-     effectiveRerankProvider = sfApiKey ? 'openai' : undefined;
+     // LLM rerank via SiliconFlow's OpenAI-compatible API
+     effectiveRerankProvider = 'siliconflow';
    } else {
-     effectiveRerankProvider = sfApiKey ? 'openai' : (gmApiKey ? 'gemini' : (oaApiKey ? 'openai' : undefined));
+     effectiveRerankProvider = sfApiKey ? 'siliconflow' : (gmApiKey ? 'gemini' : (oaApiKey ? 'openai' : undefined));
    }
  }
  const effectiveEmbedProvider = embedProvider || (sfApiKey ? 'siliconflow' : (oaApiKey ? 'openai' : undefined));
@@ -283,6 +309,7 @@ function getRemoteLLM(): RemoteLLM | null {
  
    const config: RemoteLLMConfig = {
      rerankProvider: effectiveRerankProvider || 'siliconflow',
+     rerankMode: rerankMode as 'llm' | 'rerank',
      embedProvider: effectiveEmbedProvider,
      queryExpansionProvider: effectiveQueryExpansionProvider,
    };
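The startup loader added in the first hunk above is a small hand-rolled `.env` parser: skip blanks and `#` comments, split on the first `=`, strip one surrounding layer of quotes, and never overwrite a variable that is already set — that last rule is what gives process env vars priority over the file. The same logic as a standalone sketch (hypothetical helper name, not the package's export):

```typescript
// Apply .env-style text to an env map, mirroring the loader's rules:
// blank lines and #-comments are skipped, values lose one layer of
// surrounding quotes, and keys already present in `env` stay untouched.
function applyDotenv(content: string, env: Record<string, string | undefined>): void {
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // not KEY=VALUE, ignore
    const key = trimmed.slice(0, eq).trim();
    const val = trimmed.slice(eq + 1).trim().replace(/^["']|["']$/g, "");
    if (!env[key]) env[key] = val; // existing env wins
  }
}
```

Note the priority check uses falsiness, so an empty-string env var would also be replaced — the same behavior as the loader in the diff.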
package/.github/workflows/release.yml DELETED
@@ -1,77 +0,0 @@
- name: Release
-
- on:
- push:
- tags:
- - 'v*'
-
- permissions:
- contents: write
-
- jobs:
- build:
- strategy:
- matrix:
- include:
- - os: macos-latest
- target: darwin-arm64
- artifact: qmd-darwin-arm64
- - os: macos-13
- target: darwin-x64
- artifact: qmd-darwin-x64
- - os: ubuntu-latest
- target: linux-x64
- artifact: qmd-linux-x64
- - os: ubuntu-latest
- target: linux-arm64
- artifact: qmd-linux-arm64
-
- runs-on: ${{ matrix.os }}
-
- steps:
- - uses: actions/checkout@v4
-
- - uses: oven-sh/setup-bun@v2
- with:
- bun-version: latest
-
- - name: Install dependencies
- run: bun install
-
- - name: Build binary
- run: |
- if [ "${{ matrix.target }}" = "linux-arm64" ]; then
- bun build --compile --target=bun-linux-arm64 src/qmd.ts --outfile ${{ matrix.artifact }}
- else
- bun build --compile src/qmd.ts --outfile ${{ matrix.artifact }}
- fi
-
- - name: Test binary
- if: matrix.target != 'linux-arm64'
- run: |
- ./${{ matrix.artifact }} --help
- ./${{ matrix.artifact }} status 2>/dev/null || true
- echo "✅ Binary runs successfully"
-
- - uses: actions/upload-artifact@v4
- with:
- name: ${{ matrix.artifact }}
- path: ${{ matrix.artifact }}
-
- release:
- needs: build
- runs-on: ubuntu-latest
- steps:
- - uses: actions/download-artifact@v4
- with:
- merge-multiple: true
-
- - name: Make binaries executable
- run: chmod +x qmd-*
-
- - name: Create Release
- uses: softprops/action-gh-release@v2
- with:
- files: qmd-*
- generate_release_notes: true
- draft: false
package/finetune/BALANCED_DISTRIBUTION.md DELETED
@@ -1,157 +0,0 @@
- # QMD Training Data - Balanced Distribution Summary
-
- ## Overview
-
- The training data has been rebalanced to reduce excessive tech focus while maintaining adequate technical coverage for QMD's use case. The new distribution emphasizes diverse life topics while keeping tech at a reasonable 15%.
-
- ## Distribution Comparison
-
- ### Before (Original Data)
- ```
- Technical: ~50% ████████████████████████████████████████
- How-to: ~45% █████████████████████████████████████
- What-is: ~40% █████████████████████████████████
- Other: ~15% ████████████
- Short queries: 10% ████████
- Temporal: 1.6% █
- Named entities: 3.4% ██
- ```
-
- ### After (Balanced Approach)
- ```
- Category Percentage
- ────────────────────────────────────────
- Health & Wellness 12% █████████
- Finance & Business 12% █████████
- Technology 15% ███████████
- Home & Garden 10% ████████
- Food & Cooking 10% ████████
- Travel & Geography 10% ████████
- Hobbies & Crafts 10% ████████
- Education & Learning 8% ██████
- Arts & Culture 8% ██████
- Lifestyle & Relationships 5% ████
- ────────────────────────────────────────
- Short queries (1-2 words): 20%
- Temporal (2025/2026): 15%
- Named entities: 10%+
- ```
-
- ## Key Improvements
-
- ### 1. Category Diversity
-
- **New Non-Tech Categories Added:**
- - **Health & Wellness**: Meditation, fitness, nutrition, mental health
- - **Finance & Business**: Budgeting, investing, career, entrepreneurship
- - **Home & Garden**: DIY, repairs, cleaning, gardening, organization
- - **Food & Cooking**: Recipes, techniques, meal planning, nutrition
- - **Travel & Geography**: Travel planning, destinations, geography facts
- - **Hobbies & Crafts**: Photography, art, music, woodworking, knitting
- - **Education & Learning**: Study techniques, languages, online courses
- - **Arts & Culture**: Art history, music, film, theater, literature
- - **Lifestyle & Relationships**: Habits, relationships, parenting, minimalism
-
- ### 2. Temporal Queries (2025/2026)
-
- Updated to use current era years for recency queries:
- - "latest research 2026"
- - "Shopify updates 2025"
- - "what changed in React 2026"
- - "AI developments 2025"
-
- This ensures the model learns to handle queries from the current time period.
-
- ### 3. Short Query Coverage
-
- Expanded from 47 to 144+ short keywords across all categories:
- - Tech: auth, config, api, cache, deploy
- - Health: meditate, hydrate, stretch, exercise
- - Finance: budget, save, invest, taxes
- - Home: clean, organize, repair, garden
- - Food: cook, bake, recipe, meal
- - Travel: travel, pack, passport, hotel
- - Hobbies: photo, draw, paint, knit, guitar
- - Education: study, learn, course, exam
- - Arts: art, music, film, dance
- - Life: habit, routine, organize, parent
-
- ## Usage
-
- ### Quick Start - Use Balanced Data
-
- ```bash
- cd finetune
-
- # Add 500 balanced examples
- cat data/qmd_expansion_balanced.jsonl >> data/qmd_expansion_v2.jsonl
-
- # Prepare with enhanced short query templates
- uv run dataset/prepare_data.py --add-short 2
-
- # Train
- uv run train.py sft --config configs/sft.yaml
- ```
-
- ### Generate Fresh Data with Claude API
-
- ```bash
- # Set API key
- export ANTHROPIC_API_KEY=your_key
-
- # Generate 300 balanced examples
- uv run dataset/generate_data.py --count 300 \
- --output data/qmd_expansion_fresh.jsonl
-
- # Analyze distribution
- uv run dataset/analyze_data.py --input data/qmd_expansion_fresh.jsonl
-
- # Prepare for training
- uv run dataset/prepare_data.py --input data/qmd_expansion_fresh.jsonl
- ```
-
- ### Generate Even More Balanced Examples
-
- ```bash
- # Generate 500 life-focused examples (15% tech)
- uv run dataset/generate_balanced.py
-
- # Or generate 265 additional diverse examples
- uv run dataset/generate_diverse.py
- ```
-
- ## File Summary
-
- ### Modified Files:
- - `dataset/generate_data.py` - Added category weights (15% tech), 2025/2026 dates
- - `dataset/prepare_data.py` - Expanded SHORT_QUERIES from 47→144, templates 5→16
-
- ### New Files:
- - `dataset/generate_balanced.py` - Life-focused generator (500 examples)
- - `dataset/generate_diverse.py` - Philosophy/History/Geography/Trivia generator (265 examples)
- - `dataset/analyze_data.py` - Dataset analysis and quality reporting
- - `DATA_IMPROVEMENTS.md` - Detailed improvement documentation
-
- ### Generated Data:
- - `data/qmd_expansion_balanced.jsonl` - 500 balanced examples
- - `data/qmd_expansion_diverse_addon.jsonl` - 265 diverse examples
-
- ## Expected Benefits
-
- 1. **Better Short Query Handling**: 20% coverage vs 10% before
- 2. **Named Entity Preservation**: 10%+ coverage vs 3.4% before
- 3. **Temporal Understanding**: 15% with 2025/2026 vs 1.6% before
- 4. **Domain Diversity**: 10 categories vs tech-only before
- 5. **Life-Document Search**: Better at searching personal notes on health, finance, hobbies
-
- ## Next Steps
-
- 1. Merge balanced examples into training set
- 2. Retrain model with improved distribution
- 3. Evaluate using `evals/queries.txt`
- 4. Monitor scores on temporal/named-entity/short queries
- 5. Iterate based on results
-
- ---
-
- Generated: 2026-01-30