mcp-probe-kit 3.0.18 → 3.0.21
- package/README.md +87 -55
- package/build/index.js +3 -1
- package/build/lib/__tests__/agents-md-template.unit.test.d.ts +1 -0
- package/build/lib/__tests__/agents-md-template.unit.test.js +27 -0
- package/build/lib/__tests__/memory-config.unit.test.js +9 -0
- package/build/lib/__tests__/memory-injection.unit.test.d.ts +1 -0
- package/build/lib/__tests__/memory-injection.unit.test.js +51 -0
- package/build/lib/__tests__/memory-orchestration.unit.test.d.ts +1 -0
- package/build/lib/__tests__/memory-orchestration.unit.test.js +84 -0
- package/build/lib/__tests__/memory-payload.unit.test.d.ts +1 -0
- package/build/lib/__tests__/memory-payload.unit.test.js +35 -0
- package/build/lib/__tests__/project-context-layout.unit.test.d.ts +1 -0
- package/build/lib/__tests__/project-context-layout.unit.test.js +80 -0
- package/build/lib/agents-md-template.d.ts +25 -0
- package/build/lib/agents-md-template.js +57 -0
- package/build/lib/memory-client.d.ts +8 -1
- package/build/lib/memory-client.js +53 -44
- package/build/lib/memory-config.d.ts +8 -0
- package/build/lib/memory-config.js +19 -0
- package/build/lib/memory-orchestration.d.ts +10 -3
- package/build/lib/memory-orchestration.js +146 -7
- package/build/lib/memory-payload.d.ts +21 -0
- package/build/lib/memory-payload.js +65 -0
- package/build/lib/merge-agents-md.d.ts +6 -0
- package/build/lib/merge-agents-md.js +51 -0
- package/build/lib/project-context-layout.d.ts +78 -0
- package/build/lib/project-context-layout.js +350 -0
- package/build/lib/workspace-root.js +6 -1
- package/build/resources/ui-ux-data/metadata.json +1 -1
- package/build/schemas/index.d.ts +62 -11
- package/build/schemas/memory-tools.d.ts +38 -9
- package/build/schemas/memory-tools.js +24 -9
- package/build/schemas/project-tools.d.ts +24 -2
- package/build/schemas/project-tools.js +24 -2
- package/build/tools/__tests__/code_insight.unit.test.js +3 -3
- package/build/tools/__tests__/init_project_context.unit.test.js +32 -21
- package/build/tools/__tests__/start_feature.unit.test.js +2 -1
- package/build/tools/code_insight.js +11 -9
- package/build/tools/index.d.ts +1 -0
- package/build/tools/index.js +1 -0
- package/build/tools/init_project_context.js +563 -506
- package/build/tools/memorize_asset.js +12 -0
- package/build/tools/scan_and_extract_patterns.js +7 -7
- package/build/tools/search_memory.d.ts +7 -0
- package/build/tools/search_memory.js +57 -0
- package/build/tools/start_bugfix.js +257 -251
- package/build/tools/start_feature.js +140 -134
- package/build/tools/start_ui.js +405 -405
- package/docs/.mcp-probe/layout.json +11 -0
- package/docs/data/tools.js +18 -0
- package/docs/i18n/all-tools/en.json +6 -1
- package/docs/i18n/all-tools/ja.json +2 -1
- package/docs/i18n/all-tools/ko.json +2 -1
- package/docs/i18n/all-tools/zh-CN.json +7 -2
- package/docs/i18n/en.json +38 -7
- package/docs/i18n/ja.json +9 -2
- package/docs/i18n/ko.json +9 -2
- package/docs/i18n/zh-CN.json +40 -9
- package/docs/memory-local-setup.md +314 -0
- package/docs/memory-local-setup.zh-CN.md +283 -0
- package/docs/pages/getting-started.html +252 -33
- package/package.json +2 -2
|
@@ -0,0 +1,314 @@
|
|
|
1
|
+
# Local Memory Stack (Qdrant + Nomic Embed)
|
|
2
|
+
|
|
3
|
+
This guide documents a **lightweight local setup** for mcp-probe-kit memory tools (`search_memory`, `memorize_asset`, `read_memory_asset`, `scan_and_extract_patterns`):
|
|
4
|
+
|
|
5
|
+
- **Qdrant** — vector database (port `50008`)
|
|
6
|
+
- **Infinity (nomic-embed)** — embedding API (port `50012`), replaces Ollama for users who want a smaller footprint
|
|
7
|
+
|
|
8
|
+
Both services are typically deployed with Docker Compose under a shared `docker-start` layout. Ports use the `500xx` convention to avoid conflicts.
|
|
9
|
+
|
|
10
|
+
| Service | Host port | Container port | Purpose |
|---------|-----------|----------------|---------|
| Qdrant HTTP | `50008` | `6333` | REST API, dashboard |
| Qdrant gRPC | `50009` | `6334` | gRPC API |
| Nomic Embed (Infinity) | `50012` | `7997` | OpenAI-compatible embeddings |
|
|
15
|
+
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
## 1. Qdrant
|
|
19
|
+
|
|
20
|
+
### Service info
|
|
21
|
+
|
|
22
|
+
| Item | Value |
|------|-------|
| Image | `qdrant/qdrant:latest` |
| Container name | `qdrant` |
| HTTP | `http://127.0.0.1:50008` |
| gRPC | `127.0.0.1:50009` |
| Data | `./data` → `/qdrant/storage` |
| Snapshots | `./snapshots` → `/qdrant/snapshots` |
| Auth | `QDRANT_API_KEY` in `.env` (header `api-key`) |
|
|
31
|
+
|
|
32
|
+
### `docker-compose.yml`
|
|
33
|
+
|
|
34
|
+
```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    restart: always
    env_file:
      - .env
    ports:
      - "50008:6333"
      - "50009:6334"
    volumes:
      - ./data:/qdrant/storage
      - ./snapshots:/qdrant/snapshots
    environment:
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__SERVICE__GRPC_PORT=6334
      - QDRANT__LOG_LEVEL=INFO
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
    healthcheck:
      test:
        - "CMD"
        - "bash"
        - "-c"
        - "exec 3<>/dev/tcp/127.0.0.1/6333 && printf 'GET /collections HTTP/1.1\r\nHost: localhost\r\napi-key: ${QDRANT_API_KEY}\r\nConnection: close\r\n\r\n' >&3 && IFS= read -r line <&3 && [[ \"$$line\" == *\"200\"* ]]"
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
```
|
|
64
|
+
|
|
65
|
+
### `.env.example`
|
|
66
|
+
|
|
67
|
+
```bash
# copy: cp .env.example .env
# generate key: python -c "import secrets; print(secrets.token_urlsafe(32))"

QDRANT_API_KEY=change-me-to-a-long-random-string
QDRANT_URL=http://127.0.0.1:50008
```
|
|
74
|
+
|
|
75
|
+
### First deploy
|
|
76
|
+
|
|
77
|
+
```powershell
cd qdrant
copy .env.example .env
# Edit .env and set a long random QDRANT_API_KEY
docker compose up -d
```
|
|
83
|
+
|
|
84
|
+
### Verify
|
|
85
|
+
|
|
86
|
+
```bash
# Health / collections (requires api-key when API_KEY is enabled)
curl http://127.0.0.1:50008/collections \
  -H "api-key: YOUR_QDRANT_API_KEY"

# Web UI (enter the same key in the dashboard)
# http://127.0.0.1:50008/dashboard
```
|
|
94
|
+
|
|
95
|
+
### Common commands
|
|
96
|
+
|
|
97
|
+
```powershell
docker compose up -d            # start
docker compose down             # stop
docker compose logs -f qdrant   # logs
docker compose restart qdrant   # restart
```
|
|
103
|
+
|
|
104
|
+
### Notes
|
|
105
|
+
|
|
106
|
+
- After enabling `QDRANT__SERVICE__API_KEY`, **all** REST and gRPC requests must include the `api-key` header.
- mcp-probe-kit reads the key from `MEMORY_QDRANT_API_KEY` and sends it in that header.
- The `mcp_probe_memory` collection is created automatically on the first `memorize_asset` write (Cosine distance; vector size inferred from the first embedding).
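
The automatic collection creation in the last note can be pictured as follows. This is a minimal sketch in Python, not mcp-probe-kit's actual code: the helper name `build_create_collection_request` is hypothetical, and the only thing assumed about Qdrant is its standard REST call `PUT /collections/{name}` with a `vectors` object carrying `size` and `distance`.

```python
import json


def build_create_collection_request(base_url, collection, first_embedding, api_key=None):
    """Build (method, url, headers, body) for creating a Qdrant collection.

    Hypothetical helper for illustration: the vector size is taken from the
    length of the first embedding, and the distance is fixed to Cosine.
    """
    if not first_embedding:
        raise ValueError("first_embedding must be a non-empty list of floats")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Qdrant expects the key in the `api-key` header when auth is enabled.
        headers["api-key"] = api_key
    body = {"vectors": {"size": len(first_embedding), "distance": "Cosine"}}
    url = f"{base_url.rstrip('/')}/collections/{collection}"
    return "PUT", url, headers, json.dumps(body)


# Placeholder values; a real embedding would come from the embedding server.
method, url, headers, body = build_create_collection_request(
    "http://127.0.0.1:50008", "mcp_probe_memory", [0.0] * 768, api_key="YOUR_QDRANT_API_KEY"
)
print(method, url)
print(body)
```

Because the size comes from the first embedding, switching to a model with a different dimension later requires a new collection.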
|
|
109
|
+
|
|
110
|
+
---
|
|
111
|
+
|
|
112
|
+
## 2. Nomic Embed (Infinity)
|
|
113
|
+
|
|
114
|
+
Lightweight embedding server based on [Infinity](https://github.com/michaelfeil/infinity). Model: `nomic-ai/nomic-embed-text-v1.5` (768 dimensions). No Ollama required.
|
|
115
|
+
|
|
116
|
+
### Service info
|
|
117
|
+
|
|
118
|
+
| Item | Value |
|------|-------|
| Image | `michaelf34/infinity:0.0.70` |
| Container name | `nomic-embed` |
| Host port | `50012` → container `7997` |
| Model | `nomic-ai/nomic-embed-text-v1.5` |
| Vector dim | **768** |
| Engine | `torch` (CPU if no GPU) |
| Auth | `INFINITY_API_KEY` → `Authorization: Bearer <key>` |
| Model cache | Docker volume `hf_cache` → `/app/.cache` |
|
|
128
|
+
|
|
129
|
+
### `docker-compose.yml`
|
|
130
|
+
|
|
131
|
+
```yaml
services:
  nomic-embed:
    image: michaelf34/infinity:0.0.70
    container_name: nomic-embed
    restart: unless-stopped
    ports:
      - "50012:7997"
    volumes:
      - hf_cache:/app/.cache
    environment:
      INFINITY_API_KEY: ${INFINITY_API_KEY}
    command:
      - v2
      - --model-id
      - nomic-ai/nomic-embed-text-v1.5
      - --revision
      - main
      - --dtype
      - float32
      - --batch-size
      - "8"
      - --engine
      - torch
      - --port
      - "7997"
      - --no-bettertransformer
    healthcheck:
      test:
        - "CMD"
        - "curl"
        - "-f"
        - "http://127.0.0.1:7997/health"
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

volumes:
  hf_cache:
```
|
|
172
|
+
|
|
173
|
+
### `.env.example`
|
|
174
|
+
|
|
175
|
+
```bash
INFINITY_API_KEY=change-me-to-a-long-random-string
```
|
|
178
|
+
|
|
179
|
+
### First deploy
|
|
180
|
+
|
|
181
|
+
```powershell
cd nomic-embed
copy .env.example .env
# Edit .env and set INFINITY_API_KEY (long random string)
docker compose up -d
docker logs -f nomic-embed   # wait for "ready to batch requests"
```
|
|
188
|
+
|
|
189
|
+
The first start downloads the model from Hugging Face, so expect a cold start of roughly **2–5 minutes**.
|
|
190
|
+
|
|
191
|
+
### Verify
|
|
192
|
+
|
|
193
|
+
```bash
curl http://127.0.0.1:50012/health

curl http://127.0.0.1:50012/models \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY"

# Important: the path is /embeddings, NOT /v1/embeddings
curl http://127.0.0.1:50012/embeddings \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-ai/nomic-embed-text-v1.5","input":"hello world"}'
```
|
|
205
|
+
|
|
206
|
+
Response: `data[0].embedding` is a **768**-float array.
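
To make the response shape concrete, here is a small Python sketch that pulls the vector out of an OpenAI-style embeddings response and checks its length. The function name `extract_embedding` is hypothetical, and the sample body is a hand-written stand-in rather than real server output.

```python
import json


def extract_embedding(response_text, expected_dim=768):
    """Return the first embedding from an OpenAI-style embeddings response."""
    payload = json.loads(response_text)
    embedding = payload["data"][0]["embedding"]
    if len(embedding) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(embedding)}")
    return embedding


# Stand-in for a real response body; a live server returns model-specific floats.
sample = json.dumps({"data": [{"embedding": [0.1] * 768, "index": 0}]})
vector = extract_embedding(sample)
print(len(vector))
```

Failing fast on an unexpected length catches a wrong model or endpoint before anything is written to Qdrant.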
|
|
207
|
+
|
|
208
|
+
Swagger: `http://127.0.0.1:50012/docs`
|
|
209
|
+
|
|
210
|
+
### Performance (CPU, indicative)
|
|
211
|
+
|
|
212
|
+
| Metric | Approx. value |
|--------|---------------|
| Latency, single short text (warm) | 30–50 ms |
| Latency, first request | ~150 ms |
| Latency, batch of 8 | ~150 ms |
| Resident memory | ~1 GB |
|
|
218
|
+
|
|
219
|
+
Suitable for MCP memory and occasional writes; not for high-concurrency bulk indexing.
|
|
220
|
+
|
|
221
|
+
---
|
|
222
|
+
|
|
223
|
+
## 3. mcp-probe-kit MCP configuration
|
|
224
|
+
|
|
225
|
+
**Recommended:** Qdrant on `50008` + Infinity on `50012` with `openai-compatible` provider.
|
|
226
|
+
|
|
227
|
+
```json
{
  "mcpServers": {
    "mcp-probe-kit": {
      "command": "npx",
      "args": ["-y", "mcp-probe-kit@latest"],
      "env": {
        "MEMORY_QDRANT_URL": "http://127.0.0.1:50008",
        "MEMORY_QDRANT_API_KEY": "YOUR_QDRANT_API_KEY",
        "MEMORY_QDRANT_COLLECTION": "mcp_probe_memory",
        "MEMORY_EMBEDDING_PROVIDER": "openai-compatible",
        "MEMORY_EMBEDDING_URL": "http://127.0.0.1:50012/embeddings",
        "MEMORY_EMBEDDING_MODEL": "nomic-ai/nomic-embed-text-v1.5",
        "MEMORY_EMBEDDING_API_KEY": "YOUR_INFINITY_API_KEY",
        "MEMORY_SEARCH_LIMIT": "3",
        "MEMORY_SUMMARY_MAX_CHARS": "280"
      }
    }
  }
}
```
|
|
248
|
+
|
|
249
|
+
Claude Code: put the same keys under `mcpServers.mcp-probe-kit.env` in `.mcp.json`.
|
|
250
|
+
|
|
251
|
+
After changing env, **fully restart** your MCP client (e.g. quit and reopen Cursor).
|
|
252
|
+
|
|
253
|
+
### Environment variable reference
|
|
254
|
+
|
|
255
|
+
| Variable | Required | Description |
|----------|----------|-------------|
| `MEMORY_QDRANT_URL` | Yes (read/write) | Qdrant base URL, e.g. `http://127.0.0.1:50008` |
| `MEMORY_QDRANT_API_KEY` | If Qdrant auth enabled | Sent as `api-key` header |
| `MEMORY_QDRANT_COLLECTION` | No | Default `mcp_probe_memory` |
| `MEMORY_EMBEDDING_URL` | Yes (write/search) | Embeddings endpoint, e.g. `http://127.0.0.1:50012/embeddings` |
| `MEMORY_EMBEDDING_MODEL` | Yes (write/search) | Embedding model name, e.g. `nomic-ai/nomic-embed-text-v1.5` |
| `MEMORY_EMBEDDING_PROVIDER` | No | Set to `openai-compatible` when using Infinity |
| `MEMORY_EMBEDDING_API_KEY` | Yes for Infinity | Bearer token = `INFINITY_API_KEY` |
| `MEMORY_SEARCH_LIMIT` | No | Default `3` |
| `MEMORY_SUMMARY_MAX_CHARS` | No | Default `280` |
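
As a rough illustration of how the table's required variables and defaults fit together, the Python sketch below resolves a configuration from an environment mapping. It is not the package's implementation: `resolve_memory_config` and the returned keys are hypothetical, and only the variable names and defaults are taken from the table.

```python
import os

REQUIRED_FOR_WRITE = ("MEMORY_QDRANT_URL", "MEMORY_EMBEDDING_URL", "MEMORY_EMBEDDING_MODEL")


def resolve_memory_config(env=None):
    """Resolve memory settings from an environment mapping (hypothetical helper)."""
    env = os.environ if env is None else env

    def value(name, default=""):
        # Treat unset and blank variables the same way.
        return (env.get(name) or "").strip() or default

    missing = [name for name in REQUIRED_FOR_WRITE if not value(name)]
    return {
        "qdrant_url": value("MEMORY_QDRANT_URL"),
        "collection": value("MEMORY_QDRANT_COLLECTION", "mcp_probe_memory"),
        "embedding_url": value("MEMORY_EMBEDDING_URL"),
        "embedding_model": value("MEMORY_EMBEDDING_MODEL"),
        "search_limit": int(value("MEMORY_SEARCH_LIMIT", "3")),
        "summary_max_chars": int(value("MEMORY_SUMMARY_MAX_CHARS", "280")),
        "missing_for_write": missing,
        "write_enabled": not missing,
    }


demo = resolve_memory_config({
    "MEMORY_QDRANT_URL": "http://127.0.0.1:50008",
    "MEMORY_EMBEDDING_URL": "http://127.0.0.1:50012/embeddings",
    "MEMORY_EMBEDDING_MODEL": "nomic-ai/nomic-embed-text-v1.5",
})
print(demo["write_enabled"], demo["collection"], demo["search_limit"])
```

Passing a plain dict instead of reading `os.environ` directly keeps the check easy to try out without touching the real MCP client configuration.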
|
|
266
|
+
|
|
267
|
+
---
|
|
268
|
+
|
|
269
|
+
## 4. End-to-end smoke test
|
|
270
|
+
|
|
271
|
+
```bash
# 1) Qdrant
curl -s http://127.0.0.1:50008/collections -H "api-key: YOUR_QDRANT_API_KEY"

# 2) Embedding
curl -s -X POST http://127.0.0.1:50012/embeddings \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-ai/nomic-embed-text-v1.5","input":"mcp-probe-kit test"}' \
  | jq '.data[0].embedding | length'
# Expected: 768
```
|
|
283
|
+
|
|
284
|
+
Then, in the IDE, call `memorize_asset` once and confirm the result with `read_memory_asset` or a semantic search through the orchestration tools.
|
|
285
|
+
|
|
286
|
+
---
|
|
287
|
+
|
|
288
|
+
## 5. Troubleshooting
|
|
289
|
+
|
|
290
|
+
| Symptom | Fix |
|---------|-----|
| Qdrant `401` | Set `MEMORY_QDRANT_API_KEY` to match `qdrant/.env` |
| Embedding `401` | Use `Authorization: Bearer` + correct `INFINITY_API_KEY` |
| Embedding `404` | URL must be `http://127.0.0.1:50012/embeddings`, not `/v1/embeddings` |
| `nomic-embed` health stuck on `starting` | First model download; check `docker logs nomic-embed` |
| Log `No CUDA runtime` | Normal on CPU |
| Dimension mismatch in Qdrant | Collection was created with another model; delete collection or use a new `MEMORY_QDRANT_COLLECTION` name |
| Memory write disabled | Ensure all three are set: `MEMORY_QDRANT_URL`, `MEMORY_EMBEDDING_URL`, `MEMORY_EMBEDDING_MODEL` |
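
The dimension-mismatch row can be checked before writing anything. The Python sketch below compares a collection's configured vector size with an embedding's length. It assumes the collection uses a single unnamed vector, so that the size appears at `result.config.params.vectors.size` in the response of `GET /collections/{name}`; the function name `check_vector_size` and the sample data are hypothetical.

```python
def check_vector_size(collection_info, embedding):
    """Compare a collection's configured vector size with an embedding's length.

    Assumes a single unnamed vector configuration in the collection info.
    """
    configured = collection_info["result"]["config"]["params"]["vectors"]["size"]
    actual = len(embedding)
    if configured == actual:
        return True, f"ok: both are {actual}-dimensional"
    return False, (
        f"mismatch: collection expects {configured}, embedding has {actual}; "
        "delete the collection or set a new MEMORY_QDRANT_COLLECTION name"
    )


# Hand-written stand-in for a collection created with a 1024-dimensional model.
info = {"result": {"config": {"params": {"vectors": {"size": 1024, "distance": "Cosine"}}}}}
ok, message = check_vector_size(info, [0.0] * 768)
print(ok, message)
```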
|
|
299
|
+
|
|
300
|
+
---
|
|
301
|
+
|
|
302
|
+
## 6. Alternatives
|
|
303
|
+
|
|
304
|
+
| Stack | When to use |
|-------|-------------|
| **Qdrant + Infinity (this guide)** | Default for local dev; lighter than Ollama |
| Qdrant + Ollama | If you already run Ollama for chat models |
| Qdrant + hosted OpenAI-compatible API | No local embedding container |
|
|
309
|
+
|
|
310
|
+
See also [README — Optional Memory System Setup](../README.md#optional-memory-system-setup).
|
|
311
|
+
|
|
312
|
+
---
|
|
313
|
+
|
|
314
|
+
**Chinese version**: the same content in Chinese is available in [memory-local-setup.zh-CN.md](./memory-local-setup.zh-CN.md).
|
|
@@ -0,0 +1,283 @@
|
|
|
1
|
+
# Local Memory Stack (Qdrant + Nomic Embed)
|
|
2
|
+
|
|
3
|
+
This guide describes a **lightweight local deployment** for `search_memory`, `memorize_asset`, `read_memory_asset`, and `scan_and_extract_patterns`:
|
|
4
|
+
|
|
5
|
+
- **Qdrant**: vector database (port `50008`)
- **Infinity (nomic-embed)**: embedding generation (port `50012`); **replaces Ollama** with a lighter footprint for users
|
|
7
|
+
|
|
8
|
+
Deploying both services with Docker Compose is recommended. Ports use the `500xx` range to avoid conflicts with other services.
|
|
9
|
+
|
|
10
|
+
| Service | Host port | Container port | Purpose |
|---------|-----------|----------------|---------|
| Qdrant HTTP | `50008` | `6333` | REST, dashboard |
| Qdrant gRPC | `50009` | `6334` | gRPC |
| Nomic Embed | `50012` | `7997` | OpenAI-compatible embeddings |
|
|
15
|
+
|
|
16
|
+
---
|
|
17
|
+
|
|
18
|
+
## 1. Qdrant vector database
|
|
19
|
+
|
|
20
|
+
### Service info
|
|
21
|
+
|
|
22
|
+
| Item | Value |
|------|-------|
| **Image** | `qdrant/qdrant:latest` |
| **Container name** | `qdrant` |
| **HTTP** | `http://127.0.0.1:50008` |
| **gRPC** | `127.0.0.1:50009` |
| **Data** | `./data` → `/qdrant/storage` |
| **Snapshots** | `./snapshots` → `/qdrant/snapshots` |
| **Auth** | `QDRANT_API_KEY` in `.env`, sent in the `api-key` header |
|
|
31
|
+
|
|
32
|
+
### `docker-compose.yml`
|
|
33
|
+
|
|
34
|
+
```yaml
services:
  qdrant:
    image: qdrant/qdrant:latest
    container_name: qdrant
    restart: always
    env_file:
      - .env
    ports:
      - "50008:6333"
      - "50009:6334"
    volumes:
      - ./data:/qdrant/storage
      - ./snapshots:/qdrant/snapshots
    environment:
      - QDRANT__SERVICE__HTTP_PORT=6333
      - QDRANT__SERVICE__GRPC_PORT=6334
      - QDRANT__LOG_LEVEL=INFO
      - QDRANT__SERVICE__API_KEY=${QDRANT_API_KEY}
    healthcheck:
      test:
        - "CMD"
        - "bash"
        - "-c"
        - "exec 3<>/dev/tcp/127.0.0.1/6333 && printf 'GET /collections HTTP/1.1\r\nHost: localhost\r\napi-key: ${QDRANT_API_KEY}\r\nConnection: close\r\n\r\n' >&3 && IFS= read -r line <&3 && [[ \"$$line\" == *\"200\"* ]]"
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 30s
```
|
|
64
|
+
|
|
65
|
+
### `.env.example`
|
|
66
|
+
|
|
67
|
+
```bash
# copy: copy .env.example .env
# generate key: python -c "import secrets; print(secrets.token_urlsafe(32))"

QDRANT_API_KEY=change-me-to-a-long-random-string
QDRANT_URL=http://127.0.0.1:50008
```
|
|
74
|
+
|
|
75
|
+
### First deploy
|
|
76
|
+
|
|
77
|
+
```powershell
cd qdrant
copy .env.example .env
# Edit .env and set QDRANT_API_KEY
docker compose up -d
```
|
|
83
|
+
|
|
84
|
+
### Verify
|
|
85
|
+
|
|
86
|
+
```bash
curl http://127.0.0.1:50008/collections \
  -H "api-key: YOUR_QDRANT_API_KEY"

# Dashboard (enter the same key)
# http://127.0.0.1:50008/dashboard
```
|
|
93
|
+
|
|
94
|
+
### Common commands
|
|
95
|
+
|
|
96
|
+
```powershell
docker compose up -d
docker compose down
docker compose logs -f qdrant
docker compose restart qdrant
```
|
|
102
|
+
|
|
103
|
+
### Notes
|
|
104
|
+
|
|
105
|
+
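- After enabling `QDRANT__SERVICE__API_KEY`, all REST and gRPC requests must include the `api-key` header.
- mcp-probe-kit passes the key through the `MEMORY_QDRANT_API_KEY` environment variable.
- The first `memorize_asset` write automatically creates the `mcp_probe_memory` collection (Cosine distance; vector size inferred from the first embedding).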
|
|
108
|
+
|
|
109
|
+
---
|
|
110
|
+
|
|
111
|
+
## 2. Nomic Embed (Infinity inference server)
|
|
112
|
+
|
|
113
|
+
Based on [Infinity](https://github.com/michaelfeil/infinity). Model: `nomic-ai/nomic-embed-text-v1.5`, **768 dimensions**. No Ollama required.
|
|
114
|
+
|
|
115
|
+
### Service info
|
|
116
|
+
|
|
117
|
+
| Item | Value |
|------|-------|
| **Image** | `michaelf34/infinity:0.0.70` |
| **Container name** | `nomic-embed` |
| **Port** | `50012` → `7997` |
| **Model** | `nomic-ai/nomic-embed-text-v1.5` |
| **Vector dim** | 768 |
| **Engine** | `torch` (CPU if no GPU) |
| **Auth** | `INFINITY_API_KEY` (`Authorization: Bearer`) |
| **Model cache** | Volume `hf_cache` → `/app/.cache` |
|
|
127
|
+
|
|
128
|
+
### `docker-compose.yml`
|
|
129
|
+
|
|
130
|
+
```yaml
services:
  nomic-embed:
    image: michaelf34/infinity:0.0.70
    container_name: nomic-embed
    restart: unless-stopped
    ports:
      - "50012:7997"
    volumes:
      - hf_cache:/app/.cache
    environment:
      INFINITY_API_KEY: ${INFINITY_API_KEY}
    command:
      - v2
      - --model-id
      - nomic-ai/nomic-embed-text-v1.5
      - --revision
      - main
      - --dtype
      - float32
      - --batch-size
      - "8"
      - --engine
      - torch
      - --port
      - "7997"
      - --no-bettertransformer
    healthcheck:
      test:
        - "CMD"
        - "curl"
        - "-f"
        - "http://127.0.0.1:7997/health"
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 120s

volumes:
  hf_cache:
```
|
|
171
|
+
|
|
172
|
+
### First deploy
|
|
173
|
+
|
|
174
|
+
```powershell
cd nomic-embed
copy .env.example .env
docker compose up -d
docker logs -f nomic-embed
```
|
|
180
|
+
|
|
181
|
+
A cold start has to download the model and takes about **2–5 minutes**; the service is ready once the log shows `ready to batch requests`.
|
|
182
|
+
|
|
183
|
+
### Access and verify
|
|
184
|
+
|
|
185
|
+
```bash
curl http://127.0.0.1:50012/health

curl http://127.0.0.1:50012/models \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY"

# Note: the path is POST /embeddings, not /v1/embeddings
curl http://127.0.0.1:50012/embeddings \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-ai/nomic-embed-text-v1.5","input":"hello world"}'
```
|
|
197
|
+
|
|
198
|
+
Swagger: `http://127.0.0.1:50012/docs`
|
|
199
|
+
|
|
200
|
+
### Performance (CPU only, indicative)
|
|
201
|
+
|
|
202
|
+
| Metric | Approx. value |
|--------|---------------|
| Latency, single short text (warm) | 30–50 ms |
| Latency, first request | ~150 ms |
| Latency, batch of 8 | ~150 ms |
| Resident memory | ~1 GB |
|
|
208
|
+
|
|
209
|
+
Suitable for MCP memory and low-frequency writes; not suitable for high-concurrency bulk indexing.
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## 3. Using with mcp-probe-kit and Qdrant
|
|
214
|
+
|
|
215
|
+
**Recommended combination:** Qdrant on `50008` + Infinity on `50012`.
|
|
216
|
+
|
|
217
|
+
```json
{
  "mcpServers": {
    "mcp-probe-kit": {
      "command": "npx",
      "args": ["-y", "mcp-probe-kit@latest"],
      "env": {
        "MEMORY_QDRANT_URL": "http://127.0.0.1:50008",
        "MEMORY_QDRANT_API_KEY": "same as QDRANT_API_KEY in qdrant/.env",
        "MEMORY_QDRANT_COLLECTION": "mcp_probe_memory",
        "MEMORY_EMBEDDING_PROVIDER": "openai-compatible",
        "MEMORY_EMBEDDING_URL": "http://127.0.0.1:50012/embeddings",
        "MEMORY_EMBEDDING_MODEL": "nomic-ai/nomic-embed-text-v1.5",
        "MEMORY_EMBEDDING_API_KEY": "same as INFINITY_API_KEY in nomic-embed/.env",
        "MEMORY_SEARCH_LIMIT": "3",
        "MEMORY_SUMMARY_MAX_CHARS": "280",
        "MEMORY_SEARCH_MIN_SCORE": "0",
        "MEMORY_SEARCH_SHOW_SOURCE": "false",
        "MEMORY_REPO_ID": ""
      }
    }
  }
}
```
|
|
241
|
+
|
|
242
|
+
Claude Code: put these keys under `mcpServers.mcp-probe-kit.env` in `.mcp.json`. After changing them, **fully restart** Cursor.
|
|
243
|
+
|
|
244
|
+
---
|
|
245
|
+
|
|
246
|
+
## 4. End-to-end smoke test
|
|
247
|
+
|
|
248
|
+
```bash
curl -s http://127.0.0.1:50008/collections -H "api-key: YOUR_QDRANT_API_KEY"

curl -s -X POST http://127.0.0.1:50012/embeddings \
  -H "Authorization: Bearer YOUR_INFINITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"nomic-ai/nomic-embed-text-v1.5","input":"测试"}' \
  | jq '.data[0].embedding | length'
# Expected output: 768
```
|
|
258
|
+
|
|
259
|
+
---
|
|
260
|
+
|
|
261
|
+
## 5. Troubleshooting
|
|
262
|
+
|
|
263
|
+
| Symptom | Fix |
|---------|-----|
| Qdrant `401` | Set `MEMORY_QDRANT_API_KEY` to match `.env` |
| Embedding `401` | Check `Authorization: Bearer` and `INFINITY_API_KEY` |
| Embedding `404` | The URL must end in `/embeddings`; do not use `/v1/embeddings` |
| Health stuck on `starting` | First model download; check `docker logs nomic-embed` |
| Log `No CUDA runtime` | Normal; indicates CPU inference |
| Vector dimension mismatch | After changing models, use a new collection or delete the old one |
| Memory write unavailable | Set all of `MEMORY_QDRANT_URL`, `MEMORY_EMBEDDING_URL`, `MEMORY_EMBEDDING_MODEL` |
|
|
272
|
+
|
|
273
|
+
---
|
|
274
|
+
|
|
275
|
+
## 6. Alternatives
|
|
276
|
+
|
|
277
|
+
| Stack | When to use |
|-------|-------------|
| **This guide (Qdrant + Infinity)** | Default for local development; lighter than Ollama |
| Qdrant + Ollama | You already run Ollama for chat models |
| Qdrant + hosted OpenAI-compatible API | You do not want to run a local embedding container |
|
|
282
|
+
|
|
283
|
+
English version: [memory-local-setup.md](./memory-local-setup.md)
|