semantic-code-mcp 2.0.0 → 2.1.0
- package/README.md +134 -112
- package/package.json +4 -4
package/README.md
CHANGED
@@ -23,35 +23,37 @@ Based on [Cursor's research](https://cursor.com/blog/semsearch) showing semantic
 ## Quick Start
 
 ```bash
-
+npx -y semantic-code-mcp@latest --workspace /path/to/your/project
 ```
 
-
+Recommended MCP config (portable, no local script dependency):
 
 ```json
 {
   "mcpServers": {
     "semantic-code-mcp": {
-      "command": "
-      "args": ["--workspace", "/path/to/your/project"]
+      "command": "npx",
+      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/your/project"]
     }
   }
 }
 ```
 
+Do not use machine-specific script paths such as `~/.codex/bin/start-smart-coding-mcp.sh` in shared documentation.
+
 That's it. Your AI assistant now has semantic code search.
 
 ## Features
 
 ### Multi-Provider Embeddings
 
-| Provider
-
-| **Local** (default)
-| **Gemini**
-| **OpenAI**
-| **OpenAI-compatible** | Any compatible endpoint | Varies
-| **Vertex AI**
+| Provider | Model | Privacy | Speed |
+| --------------------- | ----------------------- | ---------- | ------------- |
+| **Local** (default) | nomic-embed-text-v1.5 | 100% local | ~50ms/chunk |
+| **Gemini** | gemini-embedding-001 | API call | Fast, batched |
+| **OpenAI** | text-embedding-3-small | API call | Fast |
+| **OpenAI-compatible** | Any compatible endpoint | Varies | Varies |
+| **Vertex AI** | Google Cloud models | GCP | Fast |
 
 ### Flexible Vector Storage
 
@@ -72,26 +74,26 @@ CPU capped at 50% during indexing. Your machine stays responsive.
 
 ## Tools
 
-| Tool
-
-| `a_semantic_search`
-| `b_index_codebase`
-| `c_clear_cache`
-| `d_check_last_version` | Look up latest package version from 20+ registries.
-| `e_set_workspace`
-| `f_get_status`
+| Tool | Description |
+| ---------------------- | ------------------------------------------------------------ |
+| `a_semantic_search` | Find code by meaning. Hybrid semantic + exact match scoring. |
+| `b_index_codebase` | Trigger manual reindex (normally automatic & incremental). |
+| `c_clear_cache` | Reset embeddings cache entirely. |
+| `d_check_last_version` | Look up latest package version from 20+ registries. |
+| `e_set_workspace` | Switch project at runtime without restart. |
+| `f_get_status` | Server health: version, index progress, config. |
 
 ## IDE Setup
 
-| IDE / App
-
-| **VS Code**
-| **Cursor**
-| **Windsurf**
-| **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌
-| **OpenCode**
-| **Raycast**
-| **Antigravity**
+| IDE / App | Guide | `${workspaceFolder}` |
+| ------------------ | ----------------------------------------- | -------------------- |
+| **VS Code** | [Setup](docs/ide-setup/vscode.md) | ✅ |
+| **Cursor** | [Setup](docs/ide-setup/cursor.md) | ✅ |
+| **Windsurf** | [Setup](docs/ide-setup/windsurf.md) | ❌ |
+| **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌ |
+| **OpenCode** | [Setup](docs/ide-setup/opencode.md) | ❌ |
+| **Raycast** | [Setup](docs/ide-setup/raycast.md) | ❌ |
+| **Antigravity** | [Setup](docs/ide-setup/antigravity.md) | ❌ |
 
 ### Multi-Project
 
@@ -99,12 +101,12 @@ CPU capped at 50% during indexing. Your machine stays responsive.
 {
   "mcpServers": {
     "code-frontend": {
-      "command": "
-      "args": ["--workspace", "/path/to/frontend"]
+      "command": "npx",
+      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/frontend"]
     },
     "code-backend": {
-      "command": "
-      "args": ["--workspace", "/path/to/backend"]
+      "command": "npx",
+      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/backend"]
     }
   }
 }
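Reviewer note: the multi-project hunk above repeats the same `npx` invocation once per workspace. As a rough sketch of how such a config could be generated rather than hand-edited, the snippet below builds one server entry per workspace. `buildMcpConfig` is a hypothetical helper written for this note; it is not part of semantic-code-mcp.

```javascript
// Hypothetical helper (not shipped with semantic-code-mcp): builds one MCP
// server entry per workspace, using the npx invocation shown in the diff.
function buildMcpConfig(workspaces) {
  const mcpServers = {};
  for (const [name, workspacePath] of Object.entries(workspaces)) {
    mcpServers[name] = {
      command: "npx",
      args: ["-y", "semantic-code-mcp@latest", "--workspace", workspacePath],
    };
  }
  return { mcpServers };
}

// Server names and paths mirror the placeholders used in the README diff.
const config = buildMcpConfig({
  "code-frontend": "/path/to/frontend",
  "code-backend": "/path/to/backend",
});
console.log(JSON.stringify(config, null, 2));
```

The printed JSON has the same shape as the new side of the hunk, so it can be pasted into an MCP client config as-is.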
@@ -116,67 +118,67 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
 
 ### Core
 
-| Variable
-
-| `SMART_CODING_VERBOSE`
-| `SMART_CODING_MAX_RESULTS`
-| `SMART_CODING_BATCH_SIZE`
-| `SMART_CODING_MAX_FILE_SIZE`
-| `SMART_CODING_CHUNK_SIZE`
-| `SMART_CODING_CHUNKING_MODE`
-| `SMART_CODING_WATCH_FILES`
-| `SMART_CODING_AUTO_INDEX_DELAY` | `5000`
-| `SMART_CODING_MAX_CPU_PERCENT`
+| Variable | Default | Description |
+| ------------------------------- | --------- | --------------------------- |
+| `SMART_CODING_VERBOSE` | `false` | Detailed logging |
+| `SMART_CODING_MAX_RESULTS` | `5` | Search results returned |
+| `SMART_CODING_BATCH_SIZE` | `100` | Files per parallel batch |
+| `SMART_CODING_MAX_FILE_SIZE` | `1048576` | Max file size (1MB) |
+| `SMART_CODING_CHUNK_SIZE` | `25` | Lines per chunk |
+| `SMART_CODING_CHUNKING_MODE` | `smart` | `smart` / `ast` / `line` |
+| `SMART_CODING_WATCH_FILES` | `false` | Auto-reindex on changes |
+| `SMART_CODING_AUTO_INDEX_DELAY` | `5000` | Background index delay (ms) |
+| `SMART_CODING_MAX_CPU_PERCENT` | `50` | CPU cap during indexing |
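Reviewer note: the Core table lists prefixed environment variables with typed defaults. The sketch below shows one plausible way to resolve them. The default values are copied from the table; the parsing rules (`"true"` for booleans, fall back on non-numeric input) and the name `loadCoreConfig` are assumptions made for illustration, not the package's actual `config.js`.

```javascript
// Defaults copied from the Core table in this diff. The parsing logic below
// is an illustrative assumption, not the package's real implementation.
const CORE_DEFAULTS = {
  VERBOSE: false,
  MAX_RESULTS: 5,
  BATCH_SIZE: 100,
  MAX_FILE_SIZE: 1048576,
  CHUNK_SIZE: 25,
  CHUNKING_MODE: "smart",
  WATCH_FILES: false,
  AUTO_INDEX_DELAY: 5000,
  MAX_CPU_PERCENT: 50,
};

// Reads SMART_CODING_<KEY> from `env`, coercing to the type of the default.
function loadCoreConfig(env) {
  const resolved = {};
  for (const [key, fallback] of Object.entries(CORE_DEFAULTS)) {
    const raw = env[`SMART_CODING_${key}`];
    if (raw === undefined || raw === "") {
      resolved[key] = fallback;
    } else if (typeof fallback === "boolean") {
      resolved[key] = raw === "true";
    } else if (typeof fallback === "number") {
      const parsed = Number(raw);
      resolved[key] = Number.isFinite(parsed) ? parsed : fallback;
    } else {
      resolved[key] = raw;
    }
  }
  return resolved;
}

const coreConfig = loadCoreConfig({
  SMART_CODING_MAX_RESULTS: "10",
  SMART_CODING_WATCH_FILES: "true",
});
console.log(coreConfig.MAX_RESULTS, coreConfig.WATCH_FILES, coreConfig.CHUNK_SIZE);
```

In a real process the argument would be `process.env`; a plain object is passed here so the example stays deterministic.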
 
 ### Embedding Provider
 
-| Variable
-
-| `SMART_CODING_EMBEDDING_PROVIDER`
-| `SMART_CODING_EMBEDDING_MODEL`
-| `SMART_CODING_EMBEDDING_DIMENSION` | `128`
-| `SMART_CODING_DEVICE`
+| Variable | Default | Description |
+| ---------------------------------- | -------------------------------- | -------------------------------------------------------------- |
+| `SMART_CODING_EMBEDDING_PROVIDER` | `local` | `local` / `gemini` / `openai` / `openai-compatible` / `vertex` |
+| `SMART_CODING_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Model name |
+| `SMART_CODING_EMBEDDING_DIMENSION` | `128` | MRL dimension (64–768) |
+| `SMART_CODING_DEVICE` | `auto` | `cpu` / `webgpu` / `auto` |
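Reviewer note: `SMART_CODING_EMBEDDING_DIMENSION` is described as an MRL (Matryoshka representation learning) dimension. The usual way such a dimension is applied is to keep the leading components of the full embedding and re-normalize. The sketch below illustrates that general technique; `truncateEmbedding` is a hypothetical name, and whether the package's `mrl-embedder.js` does exactly this is an assumption.

```javascript
// Matryoshka-style truncation: keep the first `dimension` components of an
// embedding and re-normalize to unit length so cosine scores stay comparable.
// Illustrative only; not taken from the package source.
function truncateEmbedding(vector, dimension = 128) {
  if (!Number.isInteger(dimension) || dimension < 1 || dimension > vector.length) {
    throw new RangeError(`dimension must be an integer in 1..${vector.length}`);
  }
  const head = vector.slice(0, dimension);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Toy 4-dimensional vector truncated to 2 dimensions: [3, 4] has length 5,
// so the result has components 0.6 and 0.8.
const shortVector = truncateEmbedding([3, 4, 12, 0], 2);
console.log(shortVector);
```

Smaller dimensions trade some retrieval quality for less storage and faster similarity computation, which is presumably why the default sits at the low end of the 64 to 768 range.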
 
 ### Gemini
 
-| Variable
-
-| `SMART_CODING_GEMINI_API_KEY`
-| `SMART_CODING_GEMINI_MODEL`
-| `SMART_CODING_GEMINI_DIMENSIONS`
-| `SMART_CODING_GEMINI_BATCH_SIZE`
-| `SMART_CODING_GEMINI_MAX_RETRIES` | `3`
+| Variable | Default | Description |
+| --------------------------------- | ---------------------- | ----------------- |
+| `SMART_CODING_GEMINI_API_KEY` | — | API key |
+| `SMART_CODING_GEMINI_MODEL` | `gemini-embedding-001` | Model |
+| `SMART_CODING_GEMINI_DIMENSIONS` | `768` | Output dimensions |
+| `SMART_CODING_GEMINI_BATCH_SIZE` | `24` | Micro-batch size |
+| `SMART_CODING_GEMINI_MAX_RETRIES` | `3` | Retry count |
 
 ### OpenAI / Compatible
 
-| Variable
-
-| `SMART_CODING_EMBEDDING_API_KEY`
-| `SMART_CODING_EMBEDDING_BASE_URL` | —
+| Variable | Default | Description |
+| --------------------------------- | ------- | -------------------------- |
+| `SMART_CODING_EMBEDDING_API_KEY` | — | API key |
+| `SMART_CODING_EMBEDDING_BASE_URL` | — | Base URL (compatible only) |
 
 ### Vertex AI
 
-| Variable
-
-| `SMART_CODING_VERTEX_PROJECT`
-| `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region
+| Variable | Default | Description |
+| ------------------------------ | ------------- | -------------- |
+| `SMART_CODING_VERTEX_PROJECT` | — | GCP project ID |
+| `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region |
 
 ### Vector Store
 
-| Variable
-
-| `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite`
-| `SMART_CODING_MILVUS_ADDRESS`
-| `SMART_CODING_MILVUS_TOKEN`
-| `SMART_CODING_MILVUS_DATABASE`
-| `SMART_CODING_MILVUS_COLLECTION`
+| Variable | Default | Description |
+| ------------------------------------ | ------------------------- | ------------------- |
+| `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite` | `sqlite` / `milvus` |
+| `SMART_CODING_MILVUS_ADDRESS` | — | Milvus endpoint |
+| `SMART_CODING_MILVUS_TOKEN` | — | Auth token |
+| `SMART_CODING_MILVUS_DATABASE` | `default` | Database name |
+| `SMART_CODING_MILVUS_COLLECTION` | `smart_coding_embeddings` | Collection |
 
 ### Search Tuning
 
-| Variable
-
-| `SMART_CODING_SEMANTIC_WEIGHT`
-| `SMART_CODING_EXACT_MATCH_BOOST` | `1.5`
+| Variable | Default | Description |
+| -------------------------------- | ------- | ------------------------ |
+| `SMART_CODING_SEMANTIC_WEIGHT` | `0.7` | Semantic vs exact weight |
+| `SMART_CODING_EXACT_MATCH_BOOST` | `1.5` | Exact match multiplier |
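Reviewer note: the Search Tuning table names a semantic weight (`0.7`) and an exact-match multiplier (`1.5`), but the diff does not show how they are combined. The sketch below is one plausible reading: scale the cosine similarity by the weight, then multiply by the boost when the query text appears verbatim in the chunk. The combination rule and the names `cosineSimilarity` and `hybridScore` are assumptions; the package's `hybrid-search.js` may differ.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Assumed combination rule (not confirmed by the source): weight the semantic
// score, then apply the boost if the chunk contains the query text verbatim
// (case-insensitive). Defaults are the values from the Search Tuning table.
function hybridScore(queryVec, chunkVec, queryText, chunkText, options = {}) {
  const { semanticWeight = 0.7, exactMatchBoost = 1.5 } = options;
  let score = semanticWeight * cosineSimilarity(queryVec, chunkVec);
  if (chunkText.toLowerCase().includes(queryText.toLowerCase())) {
    score *= exactMatchBoost;
  }
  return score;
}

const withoutMatch = hybridScore([1, 0], [1, 0], "parseConfig", "function loadSettings() {}");
const withMatch = hybridScore([1, 0], [1, 0], "parseConfig", "function parseConfig() {}");
console.log(withoutMatch, withMatch);
```

Under this reading, two chunks with identical embeddings are ranked apart purely by the literal match, which is the behavior the "Hybrid semantic + exact match scoring" description suggests.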
 
 ### Example with Gemini + Milvus
 
@@ -184,8 +186,8 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
 {
   "mcpServers": {
     "semantic-code-mcp": {
-      "command": "
-      "args": ["--workspace", "/path/to/project"],
+      "command": "npx",
+      "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"],
       "env": {
         "SMART_CODING_EMBEDDING_PROVIDER": "gemini",
         "SMART_CODING_GEMINI_API_KEY": "YOUR_KEY",
@@ -199,44 +201,64 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
 
 ## Architecture
 
-```
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+```mermaid
+graph TB
+    subgraph MCP["MCP Server (index.js)"]
+        direction TB
+        CFG["config.js<br/>Configuration"]
+    end
+
+    subgraph Features
+        SEARCH["hybrid-search.js<br/>Semantic + Exact Match"]
+        INDEX["index-codebase.js<br/>File Discovery & Indexing"]
+        STATUS["get-status.js<br/>Server Health"]
+        WORKSPACE["set-workspace.js<br/>Runtime Switching"]
+        VERSION["check-last-version.js<br/>Registry Lookup"]
+        CLEAR["clear-cache.js<br/>Cache Reset"]
+    end
+
+    subgraph Embeddings["Embedding Providers"]
+        LOCAL["mrl-embedder.js<br/>nomic-embed-text v1.5"]
+        GEMINI["gemini-embedder.js<br/>Gemini / Vertex AI"]
+        OAI["OpenAI / Compatible"]
+    end
+
+    subgraph Storage["Vector Storage"]
+        SQLITE["cache.js<br/>SQLite (default)"]
+        MILVUS["milvus-cache.js<br/>Milvus ANN"]
+        FACTORY["cache-factory.js<br/>Provider Selection"]
+    end
+
+    subgraph Chunking["Code Chunking"]
+        AST["ast-chunker.js<br/>Tree-sitter AST"]
+        SMART["utils.js<br/>Smart Regex"]
+    end
+
+    MCP --> Features
+    INDEX --> Chunking --> Embeddings --> FACTORY
+    FACTORY --> SQLITE
+    FACTORY --> MILVUS
+    SEARCH --> Embeddings
+    SEARCH --> FACTORY
 ```
 
 ## How It Works
 
-```
-
-
-
-
-
-
-
-
-
-
-
-
+```mermaid
+flowchart LR
+    A["📁 Source Files"] -->|glob + .gitignore| B["✂️ Smart/AST\nChunking"]
+    B -->|language-aware| C["🧠 AI Embedding\n(Local or API)"]
+    C -->|vectors| D["💾 SQLite / Milvus\nStorage"]
+    D -->|incremental hash| D
+
+    E["🔍 Search Query"] -->|embed| C
+    C -->|cosine similarity| F["📊 Hybrid Scoring\nsemantic + exact match"]
+    F --> G["🎯 Top N Results\nwith relevance scores"]
+
+    style A fill:#2d3748,color:#e2e8f0
+    style C fill:#553c9a,color:#e9d8fd
+    style D fill:#2a4365,color:#bee3f8
+    style G fill:#22543d,color:#c6f6d5
 ```
 
 **Progressive indexing** — search works immediately while indexing continues in the background. Only changed files are re-indexed on subsequent runs.
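Reviewer note: the "incremental hash" edge and the statement that only changed files are re-indexed suggest content hashing per file. The sketch below shows the general idea: hash each file, compare against the hashes stored from the previous run, and re-embed only the files whose hash differs. FNV-1a is used purely as a dependency-free stand-in; the hash function the package actually uses is not shown in this diff, and `planReindex` is a hypothetical name.

```javascript
// 32-bit FNV-1a over UTF-16 code units. A stand-in hash for illustration;
// the package's real hashing scheme is not visible in this diff.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// files: Map of path -> current content.
// previousHashes: Map of path -> hash recorded on the last indexing run.
// Returns the paths to (re)index, the paths to drop, and the new hash table.
function planReindex(files, previousHashes) {
  const changed = [];
  const nextHashes = new Map();
  for (const [path, content] of files) {
    const hash = fnv1a(content);
    nextHashes.set(path, hash);
    if (previousHashes.get(path) !== hash) changed.push(path);
  }
  const removed = [...previousHashes.keys()].filter((path) => !files.has(path));
  return { changed, removed, nextHashes };
}

// First run indexes everything; the second run only sees b.js as changed.
const firstRun = planReindex(
  new Map([["a.js", "let a = 1;"], ["b.js", "let b = 2;"]]),
  new Map()
);
const secondRun = planReindex(
  new Map([["a.js", "let a = 1;"], ["b.js", "let b = 3;"]]),
  firstRun.nextHashes
);
console.log(firstRun.changed, secondRun.changed);
```

A production index would more likely use a cryptographic or 64-bit hash to make collisions negligible; the 32-bit variant here just keeps the example short.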
@@ -256,4 +278,4 @@ See [LICENSE](LICENSE) for full text.
 
 ---
 
-*
+*Forked from [smart-coding-mcp](https://github.com/omarHaris/smart-coding-mcp) by Omar Haris. Extended with multi-provider embeddings (Gemini, Vertex AI, OpenAI), Milvus ANN search, AST chunking, resource throttling, and comprehensive test suite.*
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "semantic-code-mcp",
-  "version": "2.
+  "version": "2.1.0",
   "description": "AI-powered semantic code search for coding agents. MCP server with multi-provider embeddings and hybrid search.",
   "type": "module",
   "main": "index.js",
@@ -42,12 +42,12 @@
   ],
   "repository": {
     "type": "git",
-    "url": "https://github.com/bitkyc08-arch/
+    "url": "https://github.com/bitkyc08-arch/semantic-code-mcp.git"
   },
   "bugs": {
-    "url": "https://github.com/bitkyc08-arch/
+    "url": "https://github.com/bitkyc08-arch/semantic-code-mcp/issues"
   },
-  "homepage": "https://github.com/bitkyc08-arch/
+  "homepage": "https://github.com/bitkyc08-arch/semantic-code-mcp#readme",
   "license": "MIT",
   "dependencies": {
     "@huggingface/transformers": "^3.8.1",