qmdr 1.0.0 → 1.0.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/AI-SETUP.md +61 -98
- package/docs/setup-openclaw.md +20 -3
- package/package.json +1 -1
- package/qmd-wrapper-backup +12 -0
- package/src/llm.ts +28 -1
- package/src/qmd.ts +29 -2
- package/.github/workflows/release.yml +0 -77
- package/finetune/BALANCED_DISTRIBUTION.md +0 -157
- package/finetune/DATA_IMPROVEMENTS.md +0 -218
- package/finetune/Justfile +0 -43
- package/finetune/Modelfile +0 -16
- package/finetune/README.md +0 -299
- package/finetune/SCORING.md +0 -286
- package/finetune/configs/accelerate_multi_gpu.yaml +0 -17
- package/finetune/configs/grpo.yaml +0 -49
- package/finetune/configs/sft.yaml +0 -42
- package/finetune/configs/sft_local.yaml +0 -40
- package/finetune/convert_gguf.py +0 -221
- package/finetune/data/best_glm_prompt.txt +0 -17
- package/finetune/data/gepa_generated.prompts.json +0 -32
- package/finetune/data/qmd_expansion_balanced_deduped.jsonl +0 -413
- package/finetune/data/qmd_expansion_diverse_addon.jsonl +0 -386
- package/finetune/data/qmd_expansion_handcrafted.jsonl +0 -65
- package/finetune/data/qmd_expansion_handcrafted_only.jsonl +0 -336
- package/finetune/data/qmd_expansion_locations.jsonl +0 -64
- package/finetune/data/qmd_expansion_people.jsonl +0 -46
- package/finetune/data/qmd_expansion_short_nontech.jsonl +0 -200
- package/finetune/data/qmd_expansion_v2.jsonl +0 -1498
- package/finetune/data/qmd_only_sampled.jsonl +0 -399
- package/finetune/dataset/analyze_data.py +0 -369
- package/finetune/dataset/clean_data.py +0 -906
- package/finetune/dataset/generate_balanced.py +0 -823
- package/finetune/dataset/generate_data.py +0 -714
- package/finetune/dataset/generate_data_offline.py +0 -206
- package/finetune/dataset/generate_diverse.py +0 -441
- package/finetune/dataset/generate_ollama.py +0 -326
- package/finetune/dataset/prepare_data.py +0 -197
- package/finetune/dataset/schema.py +0 -73
- package/finetune/dataset/score_data.py +0 -115
- package/finetune/dataset/validate_schema.py +0 -104
- package/finetune/eval.py +0 -196
- package/finetune/evals/queries.txt +0 -56
- package/finetune/gepa/__init__.py +0 -1
- package/finetune/gepa/best_prompt.txt +0 -31
- package/finetune/gepa/best_prompt_glm.txt +0 -1
- package/finetune/gepa/dspy_gepa.py +0 -204
- package/finetune/gepa/example.py +0 -117
- package/finetune/gepa/generate.py +0 -129
- package/finetune/gepa/gepa_outputs.jsonl +0 -10
- package/finetune/gepa/gepa_outputs_glm.jsonl +0 -20
- package/finetune/gepa/model.json +0 -19
- package/finetune/gepa/optimizer.py +0 -70
- package/finetune/gepa/score.py +0 -84
- package/finetune/jobs/eval.py +0 -490
- package/finetune/jobs/eval_common.py +0 -354
- package/finetune/jobs/eval_verbose.py +0 -113
- package/finetune/jobs/grpo.py +0 -141
- package/finetune/jobs/quantize.py +0 -244
- package/finetune/jobs/sft.py +0 -121
- package/finetune/pyproject.toml +0 -23
- package/finetune/reward.py +0 -610
- package/finetune/train.py +0 -611
- package/finetune/uv.lock +0 -4070
- package/src/cli.test.ts +0 -963
- package/src/eval.test.ts +0 -412
- package/src/llm.test.ts +0 -559
- package/src/mcp.test.ts +0 -889
- package/src/store-paths.test.ts +0 -395
- package/src/store.test.ts +0 -2483
- package/test/eval-harness.ts +0 -223
package/AI-SETUP.md
CHANGED

@@ -112,9 +112,16 @@ chmod +x /usr/local/bin/qmd
 
 ---
 
-## Phase 3: Configure
+## Phase 3: Configure API Keys
 
-
+All configuration goes into one file: `~/.config/qmd/.env`
+
+QMDR automatically reads this file on startup (both Bun and binary installs). Create it:
+
+```bash
+mkdir -p ~/.config/qmd
+nano ~/.config/qmd/.env  # or any editor
+```
 
 > **Do you have a SiliconFlow API key?**
 >
@@ -124,8 +131,9 @@ SiliconFlow provides the embedding model. Ask:
 
 ### Option 1: User has key
 
-
-
+Write to `~/.config/qmd/.env`:
+```
+QMD_SILICONFLOW_API_KEY=sk-user-provided-key
 ```
 
 ### Option 2: Guide to get key
@@ -142,76 +150,45 @@ export QMD_SILICONFLOW_API_KEY=sk-user-provided-key
 > 2. Go to API Keys page → Create new key
 > 3. Free tier includes embedding models
 
-After user provides key
-```
-
+After user provides key, write to `~/.config/qmd/.env`:
+```
+QMD_SILICONFLOW_API_KEY=sk-user-provided-key
 ```
 
-**International users —
-```
-
+**International users — add this line too:**
+```
+QMD_SILICONFLOW_BASE_URL=https://api.siliconflow.com/v1
 ```
 (China users don't need this — the default `https://api.siliconflow.cn/v1` works)
 
 ### Option 3: Custom provider
 
-Ask for
-1. API endpoint URL (must be OpenAI-compatible)
-2. API key
-
-```bash
-export QMD_OPENAI_API_KEY=user-key
-export QMD_OPENAI_BASE_URL=https://their-endpoint.com/v1
-export QMD_EMBED_PROVIDER=openai
+Ask for endpoint URL and API key, then write to `~/.config/qmd/.env`:
 ```
-
-
-
-curl -s "$QMD_OPENAI_BASE_URL/models" -H "Authorization: Bearer $QMD_OPENAI_API_KEY" | head -5
+QMD_OPENAI_API_KEY=user-key
+QMD_OPENAI_BASE_URL=https://their-endpoint.com/v1
+QMD_EMBED_PROVIDER=openai
 ```
 
 ### Configure embedding model
 
 Default: `Qwen/Qwen3-Embedding-8B` (on SiliconFlow, free)
 
-
-> Keep the default embedding model (`Qwen/Qwen3-Embedding-8B`), or choose your own?
->
-> If you don't know which to pick, the default is good.
+> Keep the default embedding model, or choose your own?
 
-If user wants to change
-```
-
+If user wants to change, add to `.env`:
+```
+QMD_SILICONFLOW_EMBED_MODEL=their-chosen-model
 # or for custom provider:
-
+QMD_OPENAI_EMBED_MODEL=their-chosen-model
 ```
 
-### Configure embedding dimensions
-
-> Do you know the output dimensions of your embedding model?
->
-> 1. **Use default** (auto-detect from model — recommended)
-> 2. **I know the dimensions** — let me enter it
-> 3. **I don't know** — please look it up for me
-
-If option 3: search the web for "{model_name} embedding dimensions" and inform the user.
-
-Note: If dimensions change after initial indexing, user must run `qmd embed -f` to rebuild.
-
 ### Configure chunk size
 
-Default: `200` tokens per chunk, `40` tokens overlap.
-
-
-
-> - **Default: 200 tokens** (recommended for most use cases)
-> - Larger chunks = more context per result, but less precise
-> - Smaller chunks = more precise, but may lose context
-
-```bash
-# Only if user wants to change defaults:
-export QMD_CHUNK_SIZE_TOKENS=200
-export QMD_CHUNK_OVERLAP_TOKENS=40
+Default: `200` tokens per chunk, `40` tokens overlap. Only add to `.env` if user wants to change:
+```
+QMD_CHUNK_SIZE_TOKENS=200
+QMD_CHUNK_OVERLAP_TOKENS=40
 ```
 
 ---
@@ -222,22 +199,19 @@ Query expansion rewrites the user's search query into multiple variations (keywo
 
 Default: `GLM-4.5-Air` on SiliconFlow (~¥1/month, fast, good quality).
 
-
-> Use the default query expansion model (`GLM-4.5-Air` on SiliconFlow), or use your own?
+> Use the default query expansion model, or use your own?
 >
 > 1. **Default** (GLM-4.5-Air on SiliconFlow — recommended)
 > 2. **Use the reranking provider's model** (configured in next step)
 > 3. **Custom** — I want to specify a model
 
-If option
-
-
-
-
-
-# For Gemini:
-export QMD_GEMINI_API_KEY=their-key
+If option 3, add to `~/.config/qmd/.env`:
+```
+QMD_QUERY_EXPANSION_PROVIDER=openai
+QMD_OPENAI_MODEL=their-model-name
+# Or for Gemini:
+QMD_QUERY_EXPANSION_PROVIDER=gemini
+QMD_GEMINI_API_KEY=their-key
 ```
 
 ---
@@ -256,9 +230,10 @@ Reranking uses a large language model to judge which search results are truly re
 
 ### Option 1: User has Gemini key
 
-
-
-
+Add to `~/.config/qmd/.env`:
+```
+QMD_GEMINI_API_KEY=user-key
+QMD_RERANK_PROVIDER=gemini
 ```
 
 ### Option 2: Guide to get Gemini key
@@ -268,48 +243,34 @@ export QMD_RERANK_PROVIDER=gemini
 > 2. Click "Get API key" → Create key
 > 3. Free tier: 15 RPM / 1M tokens per day (more than enough for reranking)
 
-Note for China users: Gemini API may require a proxy.
-```
-
+Note for China users: Gemini API may require a proxy. Add to `.env`:
+```
+QMD_GEMINI_BASE_URL=https://their-proxy-endpoint
 ```
 
 If no proxy available, recommend Option 3 instead.
 
 ### Option 3: Alternative reranking
 
+Add to `~/.config/qmd/.env`:
+
 **Using SiliconFlow LLM rerank (no extra key needed):**
-```
-
-
+```
+QMD_RERANK_PROVIDER=siliconflow
+QMD_RERANK_MODE=llm
 ```
 
-**Using a dedicated rerank model API
-```
-
-
-
+**Using a dedicated rerank model API:**
+```
+QMD_RERANK_PROVIDER=siliconflow
+QMD_RERANK_MODE=rerank
+QMD_SILICONFLOW_RERANK_MODEL=BAAI/bge-reranker-v2-m3
 ```
 
 **Using OpenAI-compatible endpoint:**
-```bash
-export QMD_RERANK_PROVIDER=openai
-export QMD_RERANK_MODE=llm
-export QMD_OPENAI_API_KEY=their-key
-export QMD_OPENAI_BASE_URL=https://their-endpoint/v1
 ```
-
-
-
-### Model selection
-
-Default reranking model: `gemini-2.5-flash` with `thinkingBudget: 0` (no reasoning overhead, pure relevance judgment).
-
-> Keep the default reranking model (`gemini-2.5-flash`), or change it?
-
-If user wants to change:
-```bash
-export QMD_GEMINI_MODEL=their-model  # for Gemini provider
-export QMD_LLM_RERANK_MODEL=their-model  # for SiliconFlow/OpenAI LLM rerank
+QMD_RERANK_PROVIDER=openai
+QMD_RERANK_MODE=llm
 ```
 
 ---
@@ -414,7 +375,7 @@ After setup, add this to `AGENTS.md` in the project root:
 ## Environment Variables Reference
 
 | Variable | Default | Purpose |
-
+|----------|---------|---------|
 | **API Keys** | | |
 | `QMD_SILICONFLOW_API_KEY` | — | SiliconFlow |
 | `QMD_GEMINI_API_KEY` | — | Google Gemini |
@@ -443,6 +404,8 @@ After setup, add this to `AGENTS.md` in the project root:
 | `QMD_RERANK_DOC_LIMIT` | `40` | Max docs for reranking |
 | `QMD_RERANK_CHUNKS_PER_DOC` | `3` | Chunks per doc for reranking |
 | **Paths** | | |
+| `QMD_CONFIG_DIR` | `~/.config/qmd` | Config directory (index.yml + .env location) |
+| `XDG_CACHE_HOME` | `~/.cache` | Cache directory (database at `$XDG_CACHE_HOME/qmd/index.sqlite`) |
 | `QMD_SQLITE_VEC_PATH` | auto | sqlite-vec native extension path |
 
 **If you change the embedding model, run `qmd embed -f` to rebuild the vector index.**
package/docs/setup-openclaw.md
CHANGED

@@ -79,7 +79,11 @@ Edit `~/.openclaw/openclaw.json` — add/merge the `memory` block:
 
 API keys must be available to the OpenClaw process, not just your shell.
 
-**
+**Option A: Use `~/.config/qmd/.env` (recommended — works for all install types)**
+
+QMDR auto-loads this file on startup. If you already configured it in Phase 3-5, you're done — OpenClaw will inherit these settings automatically.
+
+**Option B: macOS (launchd) — set in the service plist:**
 Add to `~/Library/LaunchAgents/ai.openclaw.gateway.plist` under `EnvironmentVariables`:
 ```xml
 <key>QMD_SILICONFLOW_API_KEY</key>
@@ -89,7 +93,7 @@ Add to `~/Library/LaunchAgents/ai.openclaw.gateway.plist` under `EnvironmentVari
 ```
 Then reload: `launchctl unload ~/Library/LaunchAgents/ai.openclaw.gateway.plist && launchctl load ~/Library/LaunchAgents/ai.openclaw.gateway.plist`
 
-**Linux (systemd):**
+**Option C: Linux (systemd) — set in the service unit:**
 Add to the service unit under `[Service]`:
 ```ini
 Environment=QMD_SILICONFLOW_API_KEY=sk-your-key
@@ -97,13 +101,26 @@ Environment=QMD_GEMINI_API_KEY=your-gemini-key
 ```
 Then: `systemctl --user daemon-reload && systemctl --user restart openclaw`
 
+**Priority:** System/process env vars > `~/.config/qmd/.env` (env vars override .env if both are set).
+
 ## Step 5: Restart and verify
 
+Before restarting, confirm:
+1. `qmd doctor` passes with no errors ✅
+2. `qmd query "test"` returns results ✅
+3. API keys are in launchd/systemd env (Step 4) ✅
+
 ```bash
 openclaw gateway restart
+
+# Or if using OpenClaw CLI:
+openclaw gateway stop && openclaw gateway start
 ```
 
-
+After restart, verify QMDR is active:
+- Ask your bot about something from past conversations
+- Check that `memory_search` results show `"provider": "qmd"` (not `"sqlite"`)
+- If results seem to fall back to basic search, check `qmd doctor` inside the OpenClaw process environment
 
 ## Step 6: Add usage tips to TOOLS.md
 
package/package.json
CHANGED

package/qmd-wrapper-backup
ADDED

@@ -0,0 +1,12 @@
+#!/bin/bash
+# QMD wrapper - runs via bun instead of compiled binary
+# When --json flag is present, redirect stderr separately to avoid breaking JSON output
+# The debug console.log calls in qmd go to stdout; we need to filter them
+if echo "$@" | grep -q '\-\-json'; then
+  # Run and extract only the JSON array from stdout
+  output=$(bun run /Users/fu/clawd/qmd/src/qmd.ts "$@" 2>/dev/null)
+  # Extract JSON: find the line starting with [ and everything after
+  echo "$output" | sed -n '/^\[/,$ p'
+else
+  exec bun run /Users/fu/clawd/qmd/src/qmd.ts "$@"
+fi
package/src/llm.ts
CHANGED

@@ -26,7 +26,16 @@ type LlamaModel = any;
 type LlamaEmbeddingContext = any;
 type LlamaToken = any;
 let LlamaChatSession: any = null;
-
+// Lazy-init LlamaChatSession only when actually needed (avoid eager import crash on Linux CI)
+async function ensureLlamaChatSession() {
+  if (!LlamaChatSession) {
+    try {
+      const m = await getLlamaCpp();
+      LlamaChatSession = m.LlamaChatSession;
+    } catch {}
+  }
+  return LlamaChatSession;
+}
 import { homedir } from "os";
 import { join } from "path";
 import { existsSync, mkdirSync, statSync, unlinkSync, readdirSync, readFileSync, writeFileSync } from "fs";
@@ -964,6 +973,7 @@ export class LlamaCpp implements LLM {
 
 export type RemoteLLMConfig = {
   rerankProvider: 'siliconflow' | 'gemini' | 'openai';
+  rerankMode?: 'llm' | 'rerank'; // 'llm' = chat model, 'rerank' = dedicated rerank API
   embedProvider?: 'siliconflow' | 'openai'; // remote embedding provider (optional)
   queryExpansionProvider?: 'siliconflow' | 'gemini' | 'openai'; // remote query expansion (optional)
   siliconflow?: {
@@ -1040,6 +1050,23 @@ export class RemoteLLM implements LLM {
     options: RerankOptions = {}
   ): Promise<RerankResult> {
     if (this.config.rerankProvider === 'siliconflow') {
+      // LLM mode: use SiliconFlow's OpenAI-compatible chat API for reranking
+      if (this.config.rerankMode === 'llm') {
+        // Build a temporary openai-like config from siliconflow settings
+        const sf = this.config.siliconflow;
+        if (!sf?.apiKey) throw new Error("SiliconFlow API key required for LLM rerank");
+        const savedOpenai = this.config.openai;
+        this.config.openai = {
+          apiKey: sf.apiKey,
+          baseUrl: (sf.baseUrl || "https://api.siliconflow.cn/v1").replace(/\/$/, ""),
+          model: sf.queryExpansionModel || "zai-org/GLM-4.5-Air",
+        };
+        try {
+          return await this.rerankWithOpenAI(query, documents, options);
+        } finally {
+          this.config.openai = savedOpenai;
+        }
+      }
       return this.rerankWithSiliconflow(query, documents, options);
     }
     if (this.config.rerankProvider === 'openai') {
package/src/qmd.ts
CHANGED

@@ -90,6 +90,31 @@ import { handleCollectionCommand } from "./app/commands/collection.js";
 import { handleSearchCommand, handleVSearchCommand, handleQueryCommand } from "./app/commands/search.js";
 import { handleCleanupCommand, handlePullCommand, handleStatusCommand, handleUpdateCommand, handleEmbedCommand, handleMcpCommand, handleDoctorCommand } from "./app/commands/maintenance.js";
 import { createLLMService } from "./app/services/llm-service.js";
+import { existsSync } from "fs";
+import { join } from "path";
+
+// =============================================================================
+// Load config from ~/.config/qmd/.env (single source of truth)
+// Environment variables set externally take priority (won't be overwritten)
+// =============================================================================
+
+const qmdConfigDir = process.env.QMD_CONFIG_DIR || join(homedir(), ".config", "qmd");
+const qmdEnvPath = join(qmdConfigDir, ".env");
+if (existsSync(qmdEnvPath)) {
+  const envContent = readFileSync(qmdEnvPath, "utf-8");
+  for (const line of envContent.split("\n")) {
+    const trimmed = line.trim();
+    if (!trimmed || trimmed.startsWith("#")) continue;
+    const eqIdx = trimmed.indexOf("=");
+    if (eqIdx === -1) continue;
+    const key = trimmed.slice(0, eqIdx).trim();
+    const val = trimmed.slice(eqIdx + 1).trim().replace(/^["']|["']$/g, "");
+    // Don't override existing env vars (system/process env takes priority)
+    if (!process.env[key]) {
+      process.env[key] = val;
+    }
+  }
+}
 
 // Enable production mode - allows using default database path
 // Tests must set INDEX_PATH or use createStore() with explicit path
@@ -270,9 +295,10 @@ function getRemoteLLM(): RemoteLLM | null {
   if (rerankProvider === 'gemini' || rerankProvider === 'openai') {
     effectiveRerankProvider = rerankProvider;
   } else if (rerankProvider === 'siliconflow') {
-
+    // LLM rerank via SiliconFlow's OpenAI-compatible API
+    effectiveRerankProvider = 'siliconflow';
   } else {
-    effectiveRerankProvider = sfApiKey ? '
+    effectiveRerankProvider = sfApiKey ? 'siliconflow' : (gmApiKey ? 'gemini' : (oaApiKey ? 'openai' : undefined));
   }
 }
 const effectiveEmbedProvider = embedProvider || (sfApiKey ? 'siliconflow' : (oaApiKey ? 'openai' : undefined));
@@ -283,6 +309,7 @@ function getRemoteLLM(): RemoteLLM | null {
 
 const config: RemoteLLMConfig = {
   rerankProvider: effectiveRerankProvider || 'siliconflow',
+  rerankMode: rerankMode as 'llm' | 'rerank',
   embedProvider: effectiveEmbedProvider,
   queryExpansionProvider: effectiveQueryExpansionProvider,
 };
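The `.env` loader added above parses `KEY=VALUE` lines, skips `#` comments, strips surrounding single or double quotes, and never overwrites keys already present in the environment. The same parsing logic, extracted into a standalone sketch (`applyDotenv` is a hypothetical name; the real code writes into `process.env` directly):

```typescript
// Apply dotenv-style content to an env map without overriding existing keys.
function applyDotenv(content: string, env: Record<string, string | undefined>): void {
  for (const line of content.split("\n")) {
    const trimmed = line.trim();
    if (!trimmed || trimmed.startsWith("#")) continue;      // blank lines and comments
    const eqIdx = trimmed.indexOf("=");
    if (eqIdx === -1) continue;                             // not a KEY=VALUE line
    const key = trimmed.slice(0, eqIdx).trim();
    const val = trimmed.slice(eqIdx + 1).trim().replace(/^["']|["']$/g, "");
    if (!env[key]) env[key] = val;                          // external env vars win over .env
  }
}

const env: Record<string, string | undefined> = { QMD_RERANK_PROVIDER: "gemini" };
applyDotenv('# comment\nQMD_SILICONFLOW_API_KEY="sk-demo"\nQMD_RERANK_PROVIDER=siliconflow', env);
console.log(env.QMD_SILICONFLOW_API_KEY); // → sk-demo  (quotes stripped)
console.log(env.QMD_RERANK_PROVIDER);     // → gemini   (existing value wins)
```

This is what implements the priority rule documented in setup-openclaw.md: process environment variables override `~/.config/qmd/.env` when both define the same key.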
package/.github/workflows/release.yml
REMOVED

@@ -1,77 +0,0 @@
-name: Release
-
-on:
-  push:
-    tags:
-      - 'v*'
-
-permissions:
-  contents: write
-
-jobs:
-  build:
-    strategy:
-      matrix:
-        include:
-          - os: macos-latest
-            target: darwin-arm64
-            artifact: qmd-darwin-arm64
-          - os: macos-13
-            target: darwin-x64
-            artifact: qmd-darwin-x64
-          - os: ubuntu-latest
-            target: linux-x64
-            artifact: qmd-linux-x64
-          - os: ubuntu-latest
-            target: linux-arm64
-            artifact: qmd-linux-arm64
-
-    runs-on: ${{ matrix.os }}
-
-    steps:
-      - uses: actions/checkout@v4
-
-      - uses: oven-sh/setup-bun@v2
-        with:
-          bun-version: latest
-
-      - name: Install dependencies
-        run: bun install
-
-      - name: Build binary
-        run: |
-          if [ "${{ matrix.target }}" = "linux-arm64" ]; then
-            bun build --compile --target=bun-linux-arm64 src/qmd.ts --outfile ${{ matrix.artifact }}
-          else
-            bun build --compile src/qmd.ts --outfile ${{ matrix.artifact }}
-          fi
-
-      - name: Test binary
-        if: matrix.target != 'linux-arm64'
-        run: |
-          ./${{ matrix.artifact }} --help
-          ./${{ matrix.artifact }} status 2>/dev/null || true
-          echo "✅ Binary runs successfully"
-
-      - uses: actions/upload-artifact@v4
-        with:
-          name: ${{ matrix.artifact }}
-          path: ${{ matrix.artifact }}
-
-  release:
-    needs: build
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/download-artifact@v4
-        with:
-          merge-multiple: true
-
-      - name: Make binaries executable
-        run: chmod +x qmd-*
-
-      - name: Create Release
-        uses: softprops/action-gh-release@v2
-        with:
-          files: qmd-*
-          generate_release_notes: true
-          draft: false
package/finetune/BALANCED_DISTRIBUTION.md
REMOVED

@@ -1,157 +0,0 @@
-# QMD Training Data - Balanced Distribution Summary
-
-## Overview
-
-The training data has been rebalanced to reduce excessive tech focus while maintaining adequate technical coverage for QMD's use case. The new distribution emphasizes diverse life topics while keeping tech at a reasonable 15%.
-
-## Distribution Comparison
-
-### Before (Original Data)
-```
-Technical: ~50% ████████████████████████████████████████
-How-to: ~45% █████████████████████████████████████
-What-is: ~40% █████████████████████████████████
-Other: ~15% ████████████
-Short queries: 10% ████████
-Temporal: 1.6% █
-Named entities: 3.4% ██
-```
-
-### After (Balanced Approach)
-```
-Category Percentage
-────────────────────────────────────────
-Health & Wellness 12% █████████
-Finance & Business 12% █████████
-Technology 15% ███████████
-Home & Garden 10% ████████
-Food & Cooking 10% ████████
-Travel & Geography 10% ████████
-Hobbies & Crafts 10% ████████
-Education & Learning 8% ██████
-Arts & Culture 8% ██████
-Lifestyle & Relationships 5% ████
-────────────────────────────────────────
-Short queries (1-2 words): 20%
-Temporal (2025/2026): 15%
-Named entities: 10%+
-```
-
-## Key Improvements
-
-### 1. Category Diversity
-
-**New Non-Tech Categories Added:**
-- **Health & Wellness**: Meditation, fitness, nutrition, mental health
-- **Finance & Business**: Budgeting, investing, career, entrepreneurship
-- **Home & Garden**: DIY, repairs, cleaning, gardening, organization
-- **Food & Cooking**: Recipes, techniques, meal planning, nutrition
-- **Travel & Geography**: Travel planning, destinations, geography facts
-- **Hobbies & Crafts**: Photography, art, music, woodworking, knitting
-- **Education & Learning**: Study techniques, languages, online courses
-- **Arts & Culture**: Art history, music, film, theater, literature
-- **Lifestyle & Relationships**: Habits, relationships, parenting, minimalism
-
-### 2. Temporal Queries (2025/2026)
-
-Updated to use current era years for recency queries:
-- "latest research 2026"
-- "Shopify updates 2025"
-- "what changed in React 2026"
-- "AI developments 2025"
-
-This ensures the model learns to handle queries from the current time period.
-
-### 3. Short Query Coverage
-
-Expanded from 47 to 144+ short keywords across all categories:
-- Tech: auth, config, api, cache, deploy
-- Health: meditate, hydrate, stretch, exercise
-- Finance: budget, save, invest, taxes
-- Home: clean, organize, repair, garden
-- Food: cook, bake, recipe, meal
-- Travel: travel, pack, passport, hotel
-- Hobbies: photo, draw, paint, knit, guitar
-- Education: study, learn, course, exam
-- Arts: art, music, film, dance
-- Life: habit, routine, organize, parent
-
-## Usage
-
-### Quick Start - Use Balanced Data
-
-```bash
-cd finetune
-
-# Add 500 balanced examples
-cat data/qmd_expansion_balanced.jsonl >> data/qmd_expansion_v2.jsonl
-
-# Prepare with enhanced short query templates
-uv run dataset/prepare_data.py --add-short 2
-
-# Train
-uv run train.py sft --config configs/sft.yaml
-```
-
-### Generate Fresh Data with Claude API
-
-```bash
-# Set API key
-export ANTHROPIC_API_KEY=your_key
-
-# Generate 300 balanced examples
-uv run dataset/generate_data.py --count 300 \
-  --output data/qmd_expansion_fresh.jsonl
-
-# Analyze distribution
-uv run dataset/analyze_data.py --input data/qmd_expansion_fresh.jsonl
-
-# Prepare for training
-uv run dataset/prepare_data.py --input data/qmd_expansion_fresh.jsonl
-```
-
-### Generate Even More Balanced Examples
-
-```bash
-# Generate 500 life-focused examples (15% tech)
-uv run dataset/generate_balanced.py
-
-# Or generate 265 additional diverse examples
-uv run dataset/generate_diverse.py
-```
-
-## File Summary
-
-### Modified Files:
-- `dataset/generate_data.py` - Added category weights (15% tech), 2025/2026 dates
-- `dataset/prepare_data.py` - Expanded SHORT_QUERIES from 47→144, templates 5→16
-
-### New Files:
-- `dataset/generate_balanced.py` - Life-focused generator (500 examples)
-- `dataset/generate_diverse.py` - Philosophy/History/Geography/Trivia generator (265 examples)
-- `dataset/analyze_data.py` - Dataset analysis and quality reporting
-- `DATA_IMPROVEMENTS.md` - Detailed improvement documentation
-
-### Generated Data:
-- `data/qmd_expansion_balanced.jsonl` - 500 balanced examples
-- `data/qmd_expansion_diverse_addon.jsonl` - 265 diverse examples
-
-## Expected Benefits
-
-1. **Better Short Query Handling**: 20% coverage vs 10% before
-2. **Named Entity Preservation**: 10%+ coverage vs 3.4% before
-3. **Temporal Understanding**: 15% with 2025/2026 vs 1.6% before
-4. **Domain Diversity**: 10 categories vs tech-only before
-5. **Life-Document Search**: Better at searching personal notes on health, finance, hobbies
-
-## Next Steps
-
-1. Merge balanced examples into training set
-2. Retrain model with improved distribution
-3. Evaluate using `evals/queries.txt`
-4. Monitor scores on temporal/named-entity/short queries
-5. Iterate based on results
-
----
-
-Generated: 2026-01-30