semantic-code-mcp 2.0.1 → 2.1.0

Files changed (2)
  1. package/README.md +122 -102
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -47,13 +47,13 @@ That's it. Your AI assistant now has semantic code search.
 
  ### Multi-Provider Embeddings
 
- | Provider | Model | Privacy | Speed |
- |----------|-------|---------|-------|
- | **Local** (default) | nomic-embed-text-v1.5 | 100% local | ~50ms/chunk |
- | **Gemini** | gemini-embedding-001 | API call | Fast, batched |
- | **OpenAI** | text-embedding-3-small | API call | Fast |
- | **OpenAI-compatible** | Any compatible endpoint | Varies | Varies |
- | **Vertex AI** | Google Cloud models | GCP | Fast |
+ | Provider | Model | Privacy | Speed |
+ | --------------------- | ----------------------- | ---------- | ------------- |
+ | **Local** (default) | nomic-embed-text-v1.5 | 100% local | ~50ms/chunk |
+ | **Gemini** | gemini-embedding-001 | API call | Fast, batched |
+ | **OpenAI** | text-embedding-3-small | API call | Fast |
+ | **OpenAI-compatible** | Any compatible endpoint | Varies | Varies |
+ | **Vertex AI** | Google Cloud models | GCP | Fast |
 
  ### Flexible Vector Storage
 
@@ -74,26 +74,26 @@ CPU capped at 50% during indexing. Your machine stays responsive.
 
  ## Tools
 
- | Tool | Description |
- |------|-------------|
- | `a_semantic_search` | Find code by meaning. Hybrid semantic + exact match scoring. |
- | `b_index_codebase` | Trigger manual reindex (normally automatic & incremental). |
- | `c_clear_cache` | Reset embeddings cache entirely. |
- | `d_check_last_version` | Look up latest package version from 20+ registries. |
- | `e_set_workspace` | Switch project at runtime without restart. |
- | `f_get_status` | Server health: version, index progress, config. |
+ | Tool | Description |
+ | ---------------------- | ------------------------------------------------------------ |
+ | `a_semantic_search` | Find code by meaning. Hybrid semantic + exact match scoring. |
+ | `b_index_codebase` | Trigger manual reindex (normally automatic & incremental). |
+ | `c_clear_cache` | Reset embeddings cache entirely. |
+ | `d_check_last_version` | Look up latest package version from 20+ registries. |
+ | `e_set_workspace` | Switch project at runtime without restart. |
+ | `f_get_status` | Server health: version, index progress, config. |
 
  ## IDE Setup
 
- | IDE / App | Guide | `${workspaceFolder}` |
- |-----------|-------|----------------------|
- | **VS Code** | [Setup](docs/ide-setup/vscode.md) | ✅ |
- | **Cursor** | [Setup](docs/ide-setup/cursor.md) | ✅ |
- | **Windsurf** | [Setup](docs/ide-setup/windsurf.md) | ❌ |
- | **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌ |
- | **OpenCode** | [Setup](docs/ide-setup/opencode.md) | ❌ |
- | **Raycast** | [Setup](docs/ide-setup/raycast.md) | ❌ |
- | **Antigravity** | [Setup](docs/ide-setup/antigravity.md) | ❌ |
+ | IDE / App | Guide | `${workspaceFolder}` |
+ | ------------------ | ----------------------------------------- | -------------------- |
+ | **VS Code** | [Setup](docs/ide-setup/vscode.md) | ✅ |
+ | **Cursor** | [Setup](docs/ide-setup/cursor.md) | ✅ |
+ | **Windsurf** | [Setup](docs/ide-setup/windsurf.md) | ❌ |
+ | **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌ |
+ | **OpenCode** | [Setup](docs/ide-setup/opencode.md) | ❌ |
+ | **Raycast** | [Setup](docs/ide-setup/raycast.md) | ❌ |
+ | **Antigravity** | [Setup](docs/ide-setup/antigravity.md) | ❌ |
 
  ### Multi-Project
 
@@ -118,67 +118,67 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
 
  ### Core
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VERBOSE` | `false` | Detailed logging |
- | `SMART_CODING_MAX_RESULTS` | `5` | Search results returned |
- | `SMART_CODING_BATCH_SIZE` | `100` | Files per parallel batch |
- | `SMART_CODING_MAX_FILE_SIZE` | `1048576` | Max file size (1MB) |
- | `SMART_CODING_CHUNK_SIZE` | `25` | Lines per chunk |
- | `SMART_CODING_CHUNKING_MODE` | `smart` | `smart` / `ast` / `line` |
- | `SMART_CODING_WATCH_FILES` | `false` | Auto-reindex on changes |
- | `SMART_CODING_AUTO_INDEX_DELAY` | `5000` | Background index delay (ms) |
- | `SMART_CODING_MAX_CPU_PERCENT` | `50` | CPU cap during indexing |
+ | Variable | Default | Description |
+ | ------------------------------- | --------- | --------------------------- |
+ | `SMART_CODING_VERBOSE` | `false` | Detailed logging |
+ | `SMART_CODING_MAX_RESULTS` | `5` | Search results returned |
+ | `SMART_CODING_BATCH_SIZE` | `100` | Files per parallel batch |
+ | `SMART_CODING_MAX_FILE_SIZE` | `1048576` | Max file size (1MB) |
+ | `SMART_CODING_CHUNK_SIZE` | `25` | Lines per chunk |
+ | `SMART_CODING_CHUNKING_MODE` | `smart` | `smart` / `ast` / `line` |
+ | `SMART_CODING_WATCH_FILES` | `false` | Auto-reindex on changes |
+ | `SMART_CODING_AUTO_INDEX_DELAY` | `5000` | Background index delay (ms) |
+ | `SMART_CODING_MAX_CPU_PERCENT` | `50` | CPU cap during indexing |
 
  ### Embedding Provider
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_EMBEDDING_PROVIDER` | `local` | `local` / `gemini` / `openai` / `openai-compatible` / `vertex` |
- | `SMART_CODING_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Model name |
- | `SMART_CODING_EMBEDDING_DIMENSION` | `128` | MRL dimension (64–768) |
- | `SMART_CODING_DEVICE` | `auto` | `cpu` / `webgpu` / `auto` |
+ | Variable | Default | Description |
+ | ---------------------------------- | -------------------------------- | -------------------------------------------------------------- |
+ | `SMART_CODING_EMBEDDING_PROVIDER` | `local` | `local` / `gemini` / `openai` / `openai-compatible` / `vertex` |
+ | `SMART_CODING_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Model name |
+ | `SMART_CODING_EMBEDDING_DIMENSION` | `128` | MRL dimension (64–768) |
+ | `SMART_CODING_DEVICE` | `auto` | `cpu` / `webgpu` / `auto` |
 
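Reviewer note on `SMART_CODING_EMBEDDING_DIMENSION`: the table describes it as an MRL (Matryoshka) dimension between 64 and 768. The diff does not show how the package applies it, so the following is only a minimal sketch of the usual Matryoshka approach: keep the first `dim` components of the full vector and L2-normalize the result. The helper name `truncateEmbedding` is hypothetical and does not come from the package.

```javascript
// Hypothetical helper, not part of semantic-code-mcp.
// Sketch of Matryoshka-style truncation: keep the leading `dim` components,
// then rescale to unit length so cosine similarity stays meaningful.
function truncateEmbedding(vector, dim = 128) {
  if (!Number.isInteger(dim) || dim < 1 || dim > vector.length) {
    throw new RangeError(`dim must be an integer between 1 and ${vector.length}`);
  }
  const head = vector.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  // An all-zero prefix cannot be normalized; return it unchanged.
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

A smaller dimension trades some retrieval quality for less storage and faster comparisons, which is presumably why the default sits well below the model's full 768.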
  ### Gemini
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_GEMINI_API_KEY` | — | API key |
- | `SMART_CODING_GEMINI_MODEL` | `gemini-embedding-001` | Model |
- | `SMART_CODING_GEMINI_DIMENSIONS` | `768` | Output dimensions |
- | `SMART_CODING_GEMINI_BATCH_SIZE` | `24` | Micro-batch size |
- | `SMART_CODING_GEMINI_MAX_RETRIES` | `3` | Retry count |
+ | Variable | Default | Description |
+ | --------------------------------- | ---------------------- | ----------------- |
+ | `SMART_CODING_GEMINI_API_KEY` | — | API key |
+ | `SMART_CODING_GEMINI_MODEL` | `gemini-embedding-001` | Model |
+ | `SMART_CODING_GEMINI_DIMENSIONS` | `768` | Output dimensions |
+ | `SMART_CODING_GEMINI_BATCH_SIZE` | `24` | Micro-batch size |
+ | `SMART_CODING_GEMINI_MAX_RETRIES` | `3` | Retry count |
 
  ### OpenAI / Compatible
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_EMBEDDING_API_KEY` | — | API key |
- | `SMART_CODING_EMBEDDING_BASE_URL` | — | Base URL (compatible only) |
+ | Variable | Default | Description |
+ | --------------------------------- | ------- | -------------------------- |
+ | `SMART_CODING_EMBEDDING_API_KEY` | — | API key |
+ | `SMART_CODING_EMBEDDING_BASE_URL` | — | Base URL (compatible only) |
 
  ### Vertex AI
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VERTEX_PROJECT` | — | GCP project ID |
- | `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region |
+ | Variable | Default | Description |
+ | ------------------------------ | ------------- | -------------- |
+ | `SMART_CODING_VERTEX_PROJECT` | — | GCP project ID |
+ | `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region |
 
  ### Vector Store
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite` | `sqlite` / `milvus` |
- | `SMART_CODING_MILVUS_ADDRESS` | — | Milvus endpoint |
- | `SMART_CODING_MILVUS_TOKEN` | — | Auth token |
- | `SMART_CODING_MILVUS_DATABASE` | `default` | Database name |
- | `SMART_CODING_MILVUS_COLLECTION` | `smart_coding_embeddings` | Collection |
+ | Variable | Default | Description |
+ | ------------------------------------ | ------------------------- | ------------------- |
+ | `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite` | `sqlite` / `milvus` |
+ | `SMART_CODING_MILVUS_ADDRESS` | — | Milvus endpoint |
+ | `SMART_CODING_MILVUS_TOKEN` | — | Auth token |
+ | `SMART_CODING_MILVUS_DATABASE` | `default` | Database name |
+ | `SMART_CODING_MILVUS_COLLECTION` | `smart_coding_embeddings` | Collection |
 
  ### Search Tuning
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_SEMANTIC_WEIGHT` | `0.7` | Semantic vs exact weight |
- | `SMART_CODING_EXACT_MATCH_BOOST` | `1.5` | Exact match multiplier |
+ | Variable | Default | Description |
+ | -------------------------------- | ------- | ------------------------ |
+ | `SMART_CODING_SEMANTIC_WEIGHT` | `0.7` | Semantic vs exact weight |
+ | `SMART_CODING_EXACT_MATCH_BOOST` | `1.5` | Exact match multiplier |
 
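Reviewer note on the Search Tuning variables: the README names a semantic weight (`0.7`) and an exact match multiplier (`1.5`) but this diff does not show the formula that combines them. The sketch below is one plausible reading, offered purely as an assumption: scale the cosine similarity by the semantic weight, then multiply by the boost when the chunk text contains the query verbatim (case-insensitive). The function names and the `{ text, vector }` chunk shape are hypothetical.

```javascript
// Hypothetical sketch of hybrid scoring; the package's real formula may differ.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Assumed combination: weight the semantic score, then boost exact matches.
function hybridScore(queryText, queryVec, chunk, opts = {}) {
  const semanticWeight = opts.semanticWeight ?? 0.7; // SMART_CODING_SEMANTIC_WEIGHT default
  const exactMatchBoost = opts.exactMatchBoost ?? 1.5; // SMART_CODING_EXACT_MATCH_BOOST default
  let score = semanticWeight * cosineSimilarity(queryVec, chunk.vector);
  if (chunk.text.toLowerCase().includes(queryText.toLowerCase())) {
    score *= exactMatchBoost;
  }
  return score;
}

// Rank chunks and keep the top N (5 mirrors the SMART_CODING_MAX_RESULTS default).
function rankChunks(queryText, queryVec, chunks, maxResults = 5) {
  return chunks
    .map((chunk) => ({ chunk, score: hybridScore(queryText, queryVec, chunk) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, maxResults);
}
```

Under this reading, two chunks with identical embeddings are separated only by whether the query string appears in their text.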
  ### Example with Gemini + Milvus
 
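Reviewer note: the README's own Gemini + Milvus example falls outside this hunk and is not reproduced here. As a rough illustration assembled only from the variable names and defaults in the tables above, the environment for that combination might be built as follows. The helper `geminiMilvusEnv` is hypothetical, and treating the API key and Milvus address as required while the token is optional is an assumption.

```javascript
// Hypothetical helper, not part of semantic-code-mcp.
// Builds an environment map from the SMART_CODING_* variables listed in the README tables.
function geminiMilvusEnv({ geminiApiKey, milvusAddress, milvusToken } = {}) {
  // Assumption: these two have no default ("—" in the tables), so treat them as required.
  if (!geminiApiKey) throw new Error("SMART_CODING_GEMINI_API_KEY is required");
  if (!milvusAddress) throw new Error("SMART_CODING_MILVUS_ADDRESS is required");

  const env = {
    SMART_CODING_EMBEDDING_PROVIDER: "gemini",
    SMART_CODING_GEMINI_API_KEY: geminiApiKey,
    SMART_CODING_GEMINI_MODEL: "gemini-embedding-001", // documented default
    SMART_CODING_VECTOR_STORE_PROVIDER: "milvus",
    SMART_CODING_MILVUS_ADDRESS: milvusAddress,
    SMART_CODING_MILVUS_DATABASE: "default", // documented default
    SMART_CODING_MILVUS_COLLECTION: "smart_coding_embeddings", // documented default
  };
  // Assumption: the auth token is only needed for secured Milvus deployments.
  if (milvusToken) env.SMART_CODING_MILVUS_TOKEN = milvusToken;
  return env;
}
```

The resulting object could be merged into the environment of whatever process launches the server; the exact client configuration is covered by the IDE setup guides rather than this diff.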
@@ -201,44 +201,64 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
 
  ## Architecture
 
- ```
- semantic-code-mcp/
- ├── index.js # MCP server entry point
- ├── lib/
- │ ├── config.js # Configuration loader
- │ ├── cache-factory.js # SQLite / Milvus provider selection
- │ ├── cache.js # SQLite vector store
- │ ├── milvus-cache.js # Milvus vector store
- │ ├── mrl-embedder.js # Local MRL embedder
- │ ├── gemini-embedder.js# Gemini API embedder
- │ ├── ast-chunker.js # Tree-sitter AST chunking
- │ ├── tokenizer.js # Token counting
- │ └── utils.js # Cosine similarity, hashing, smart chunking
- ├── features/
- │ ├── hybrid-search.js # Semantic + exact match search
- │ ├── index-codebase.js # File discovery & incremental indexing
- │ ├── clear-cache.js # Cache reset
- │ ├── check-last-version.js # Package version lookup
- │ ├── set-workspace.js # Runtime workspace switching
- │ └── get-status.js # Server status
- └── test/ # Vitest test suite
+ ```mermaid
+ graph TB
+ subgraph MCP["MCP Server (index.js)"]
+ direction TB
+ CFG["config.js<br/>Configuration"]
+ end
+
+ subgraph Features
+ SEARCH["hybrid-search.js<br/>Semantic + Exact Match"]
+ INDEX["index-codebase.js<br/>File Discovery & Indexing"]
+ STATUS["get-status.js<br/>Server Health"]
+ WORKSPACE["set-workspace.js<br/>Runtime Switching"]
+ VERSION["check-last-version.js<br/>Registry Lookup"]
+ CLEAR["clear-cache.js<br/>Cache Reset"]
+ end
+
+ subgraph Embeddings["Embedding Providers"]
+ LOCAL["mrl-embedder.js<br/>nomic-embed-text v1.5"]
+ GEMINI["gemini-embedder.js<br/>Gemini / Vertex AI"]
+ OAI["OpenAI / Compatible"]
+ end
+
+ subgraph Storage["Vector Storage"]
+ SQLITE["cache.js<br/>SQLite (default)"]
+ MILVUS["milvus-cache.js<br/>Milvus ANN"]
+ FACTORY["cache-factory.js<br/>Provider Selection"]
+ end
+
+ subgraph Chunking["Code Chunking"]
+ AST["ast-chunker.js<br/>Tree-sitter AST"]
+ SMART["utils.js<br/>Smart Regex"]
+ end
+
+ MCP --> Features
+ INDEX --> Chunking --> Embeddings --> FACTORY
+ FACTORY --> SQLITE
+ FACTORY --> MILVUS
+ SEARCH --> Embeddings
+ SEARCH --> FACTORY
  ```
 
  ## How It Works
 
- ```
- Your code files
- glob + .gitignore-aware discovery
- Smart/AST chunking
- language-aware splitting
- AI embedding (local or API)
- ↓ vector generation
- SQLite or Milvus storage
- incremental, hash-based updates
-
- Search query
- embed query → cosine similarity → exact match boost
- Top N results with relevance scores
+ ```mermaid
+ flowchart LR
+ A["📁 Source Files"] -->|glob + .gitignore| B["✂️ Smart/AST\nChunking"]
+ B -->|language-aware| C["🧠 AI Embedding\n(Local or API)"]
+ C -->|vectors| D["💾 SQLite / Milvus\nStorage"]
+ D -->|incremental hash| D
+
+ E["🔍 Search Query"] -->|embed| C
+ C -->|cosine similarity| F["📊 Hybrid Scoring\nsemantic + exact match"]
+ F --> G["🎯 Top N Results\nwith relevance scores"]
+
+ style A fill:#2d3748,color:#e2e8f0
+ style C fill:#553c9a,color:#e9d8fd
+ style D fill:#2a4365,color:#bee3f8
+ style G fill:#22543d,color:#c6f6d5
  ```
 
  **Progressive indexing** — search works immediately while indexing continues in the background. Only changed files are re-indexed on subsequent runs.
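Reviewer note on "only changed files are re-indexed": both versions of the How It Works diagram mention incremental, hash-based updates, but the diff does not show the mechanism. A minimal sketch of the general idea, under stated assumptions, is to store a content hash per file and re-embed only files whose hash differs from the previous run. Everything here is hypothetical: `planReindex` is an illustrative name, and the small FNV-1a hash (computed over UTF-16 code units) stands in for whatever hash the package really uses.

```javascript
// Stand-in hash for illustration only: 32-bit FNV-1a over UTF-16 code units.
// A real indexer would more likely use a cryptographic digest of the file bytes.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Hypothetical planner: compares current file contents against stored hashes.
// `files` and `previousHashes` are Maps keyed by file path.
function planReindex(files, previousHashes) {
  const toIndex = [];
  const nextHashes = new Map();
  for (const [path, content] of files) {
    const hash = fnv1a(content);
    nextHashes.set(path, hash);
    // New files have no stored hash; changed files have a different one.
    if (previousHashes.get(path) !== hash) toIndex.push(path);
  }
  // Files that disappeared should have their stale vectors dropped.
  const removed = [...previousHashes.keys()].filter((p) => !nextHashes.has(p));
  return { toIndex, removed, nextHashes };
}
```

Persisting `nextHashes` between runs is what would make the second run cheap: unchanged files are skipped before any chunking or embedding happens.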
@@ -258,4 +278,4 @@ See [LICENSE](LICENSE) for full text.
 
  ---
 
- *Built on [smart-coding-mcp](https://github.com/omarHaris/smart-coding-mcp) by Omar Haris. Extended with multi-provider embeddings, Milvus ANN search, AST chunking, resource throttling, and comprehensive test suite.*
+ *Forked from [smart-coding-mcp](https://github.com/omarHaris/smart-coding-mcp) by Omar Haris. Extended with multi-provider embeddings (Gemini, Vertex AI, OpenAI), Milvus ANN search, AST chunking, resource throttling, and comprehensive test suite.*
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "semantic-code-mcp",
- "version": "2.0.1",
+ "version": "2.1.0",
  "description": "AI-powered semantic code search for coding agents. MCP server with multi-provider embeddings and hybrid search.",
  "type": "module",
  "main": "index.js",