semantic-code-mcp 2.0.0 → 2.1.0

This diff shows the changes between two publicly released versions of this package, as published to a supported registry. It is provided for informational purposes only.
Files changed (2)
  1. package/README.md +134 -112
  2. package/package.json +4 -4
package/README.md CHANGED
@@ -23,35 +23,37 @@ Based on [Cursor's research](https://cursor.com/blog/semsearch) showing semantic
  ## Quick Start

  ```bash
- npm install -g semantic-code-mcp
+ npx -y semantic-code-mcp@latest --workspace /path/to/your/project
  ```

- Add to your MCP config:
+ Recommended MCP config (portable, no local script dependency):

  ```json
  {
    "mcpServers": {
      "semantic-code-mcp": {
-       "command": "semantic-code-mcp",
-       "args": ["--workspace", "/path/to/your/project"]
+       "command": "npx",
+       "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/your/project"]
      }
    }
  }
  ```

+ Do not use machine-specific script paths such as `~/.codex/bin/start-smart-coding-mcp.sh` in shared documentation.
+
  That's it. Your AI assistant now has semantic code search.

  ## Features

  ### Multi-Provider Embeddings

- | Provider | Model | Privacy | Speed |
- |----------|-------|---------|-------|
- | **Local** (default) | nomic-embed-text-v1.5 | 100% local | ~50ms/chunk |
- | **Gemini** | gemini-embedding-001 | API call | Fast, batched |
- | **OpenAI** | text-embedding-3-small | API call | Fast |
- | **OpenAI-compatible** | Any compatible endpoint | Varies | Varies |
- | **Vertex AI** | Google Cloud models | GCP | Fast |
+ | Provider              | Model                   | Privacy    | Speed         |
+ | --------------------- | ----------------------- | ---------- | ------------- |
+ | **Local** (default)   | nomic-embed-text-v1.5   | 100% local | ~50ms/chunk   |
+ | **Gemini**            | gemini-embedding-001    | API call   | Fast, batched |
+ | **OpenAI**            | text-embedding-3-small  | API call   | Fast          |
+ | **OpenAI-compatible** | Any compatible endpoint | Varies     | Varies       |
+ | **Vertex AI**         | Google Cloud models     | GCP        | Fast          |

  ### Flexible Vector Storage

@@ -72,26 +74,26 @@ CPU capped at 50% during indexing. Your machine stays responsive.

  ## Tools

- | Tool | Description |
- |------|-------------|
- | `a_semantic_search` | Find code by meaning. Hybrid semantic + exact match scoring. |
- | `b_index_codebase` | Trigger manual reindex (normally automatic & incremental). |
- | `c_clear_cache` | Reset embeddings cache entirely. |
- | `d_check_last_version` | Look up latest package version from 20+ registries. |
- | `e_set_workspace` | Switch project at runtime without restart. |
- | `f_get_status` | Server health: version, index progress, config. |
+ | Tool                   | Description                                                  |
+ | ---------------------- | ------------------------------------------------------------ |
+ | `a_semantic_search`    | Find code by meaning. Hybrid semantic + exact match scoring. |
+ | `b_index_codebase`     | Trigger manual reindex (normally automatic & incremental).   |
+ | `c_clear_cache`        | Reset embeddings cache entirely.                             |
+ | `d_check_last_version` | Look up latest package version from 20+ registries.          |
+ | `e_set_workspace`      | Switch project at runtime without restart.                   |
+ | `f_get_status`         | Server health: version, index progress, config.              |

  ## IDE Setup

- | IDE / App | Guide | `${workspaceFolder}` |
- |-----------|-------|----------------------|
- | **VS Code** | [Setup](docs/ide-setup/vscode.md) | ✅ |
- | **Cursor** | [Setup](docs/ide-setup/cursor.md) | ✅ |
- | **Windsurf** | [Setup](docs/ide-setup/windsurf.md) | ❌ |
- | **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌ |
- | **OpenCode** | [Setup](docs/ide-setup/opencode.md) | ❌ |
- | **Raycast** | [Setup](docs/ide-setup/raycast.md) | ❌ |
- | **Antigravity** | [Setup](docs/ide-setup/antigravity.md) | ❌ |
+ | IDE / App          | Guide                                     | `${workspaceFolder}` |
+ | ------------------ | ----------------------------------------- | -------------------- |
+ | **VS Code**        | [Setup](docs/ide-setup/vscode.md)         | ✅                   |
+ | **Cursor**         | [Setup](docs/ide-setup/cursor.md)         | ✅                   |
+ | **Windsurf**       | [Setup](docs/ide-setup/windsurf.md)       | ❌                   |
+ | **Claude Desktop** | [Setup](docs/ide-setup/claude-desktop.md) | ❌                   |
+ | **OpenCode**       | [Setup](docs/ide-setup/opencode.md)       | ❌                   |
+ | **Raycast**        | [Setup](docs/ide-setup/raycast.md)        | ❌                   |
+ | **Antigravity**    | [Setup](docs/ide-setup/antigravity.md)    | ❌                   |

  ### Multi-Project

@@ -99,12 +101,12 @@ CPU capped at 50% during indexing. Your machine stays responsive.
  {
    "mcpServers": {
      "code-frontend": {
-       "command": "semantic-code-mcp",
-       "args": ["--workspace", "/path/to/frontend"]
+       "command": "npx",
+       "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/frontend"]
      },
      "code-backend": {
-       "command": "semantic-code-mcp",
-       "args": ["--workspace", "/path/to/backend"]
+       "command": "npx",
+       "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/backend"]
      }
    }
  }
@@ -116,67 +118,67 @@ All settings via environment variables. Prefix: `SMART_CODING_`.

  ### Core

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VERBOSE` | `false` | Detailed logging |
- | `SMART_CODING_MAX_RESULTS` | `5` | Search results returned |
- | `SMART_CODING_BATCH_SIZE` | `100` | Files per parallel batch |
- | `SMART_CODING_MAX_FILE_SIZE` | `1048576` | Max file size (1MB) |
- | `SMART_CODING_CHUNK_SIZE` | `25` | Lines per chunk |
- | `SMART_CODING_CHUNKING_MODE` | `smart` | `smart` / `ast` / `line` |
- | `SMART_CODING_WATCH_FILES` | `false` | Auto-reindex on changes |
- | `SMART_CODING_AUTO_INDEX_DELAY` | `5000` | Background index delay (ms) |
- | `SMART_CODING_MAX_CPU_PERCENT` | `50` | CPU cap during indexing |
+ | Variable                        | Default   | Description                 |
+ | ------------------------------- | --------- | --------------------------- |
+ | `SMART_CODING_VERBOSE`          | `false`   | Detailed logging            |
+ | `SMART_CODING_MAX_RESULTS`      | `5`       | Search results returned     |
+ | `SMART_CODING_BATCH_SIZE`       | `100`     | Files per parallel batch    |
+ | `SMART_CODING_MAX_FILE_SIZE`    | `1048576` | Max file size (1MB)         |
+ | `SMART_CODING_CHUNK_SIZE`       | `25`      | Lines per chunk             |
+ | `SMART_CODING_CHUNKING_MODE`    | `smart`   | `smart` / `ast` / `line`    |
+ | `SMART_CODING_WATCH_FILES`      | `false`   | Auto-reindex on changes     |
+ | `SMART_CODING_AUTO_INDEX_DELAY` | `5000`    | Background index delay (ms) |
+ | `SMART_CODING_MAX_CPU_PERCENT`  | `50`      | CPU cap during indexing     |

  ### Embedding Provider

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_EMBEDDING_PROVIDER` | `local` | `local` / `gemini` / `openai` / `openai-compatible` / `vertex` |
- | `SMART_CODING_EMBEDDING_MODEL` | `nomic-ai/nomic-embed-text-v1.5` | Model name |
- | `SMART_CODING_EMBEDDING_DIMENSION` | `128` | MRL dimension (64–768) |
- | `SMART_CODING_DEVICE` | `auto` | `cpu` / `webgpu` / `auto` |
+ | Variable                           | Default                          | Description                                                    |
+ | ---------------------------------- | -------------------------------- | -------------------------------------------------------------- |
+ | `SMART_CODING_EMBEDDING_PROVIDER`  | `local`                          | `local` / `gemini` / `openai` / `openai-compatible` / `vertex` |
+ | `SMART_CODING_EMBEDDING_MODEL`     | `nomic-ai/nomic-embed-text-v1.5` | Model name                                                     |
+ | `SMART_CODING_EMBEDDING_DIMENSION` | `128`                            | MRL dimension (64–768)                                         |
+ | `SMART_CODING_DEVICE`              | `auto`                           | `cpu` / `webgpu` / `auto`                                      |

  ### Gemini

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_GEMINI_API_KEY` | — | API key |
- | `SMART_CODING_GEMINI_MODEL` | `gemini-embedding-001` | Model |
- | `SMART_CODING_GEMINI_DIMENSIONS` | `768` | Output dimensions |
- | `SMART_CODING_GEMINI_BATCH_SIZE` | `24` | Micro-batch size |
- | `SMART_CODING_GEMINI_MAX_RETRIES` | `3` | Retry count |
+ | Variable                          | Default                | Description       |
+ | --------------------------------- | ---------------------- | ----------------- |
+ | `SMART_CODING_GEMINI_API_KEY`     | —                      | API key           |
+ | `SMART_CODING_GEMINI_MODEL`       | `gemini-embedding-001` | Model             |
+ | `SMART_CODING_GEMINI_DIMENSIONS`  | `768`                  | Output dimensions |
+ | `SMART_CODING_GEMINI_BATCH_SIZE`  | `24`                   | Micro-batch size  |
+ | `SMART_CODING_GEMINI_MAX_RETRIES` | `3`                    | Retry count       |

  ### OpenAI / Compatible

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_EMBEDDING_API_KEY` | — | API key |
- | `SMART_CODING_EMBEDDING_BASE_URL` | — | Base URL (compatible only) |
+ | Variable                          | Default | Description                |
+ | --------------------------------- | ------- | -------------------------- |
+ | `SMART_CODING_EMBEDDING_API_KEY`  | —       | API key                    |
+ | `SMART_CODING_EMBEDDING_BASE_URL` | —       | Base URL (compatible only) |

  ### Vertex AI

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VERTEX_PROJECT` | — | GCP project ID |
- | `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region |
+ | Variable                       | Default       | Description    |
+ | ------------------------------ | ------------- | -------------- |
+ | `SMART_CODING_VERTEX_PROJECT`  | —             | GCP project ID |
+ | `SMART_CODING_VERTEX_LOCATION` | `us-central1` | Region         |

  ### Vector Store

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite` | `sqlite` / `milvus` |
- | `SMART_CODING_MILVUS_ADDRESS` | — | Milvus endpoint |
- | `SMART_CODING_MILVUS_TOKEN` | — | Auth token |
- | `SMART_CODING_MILVUS_DATABASE` | `default` | Database name |
- | `SMART_CODING_MILVUS_COLLECTION` | `smart_coding_embeddings` | Collection |
+ | Variable                             | Default                   | Description         |
+ | ------------------------------------ | ------------------------- | ------------------- |
+ | `SMART_CODING_VECTOR_STORE_PROVIDER` | `sqlite`                  | `sqlite` / `milvus` |
+ | `SMART_CODING_MILVUS_ADDRESS`        | —                         | Milvus endpoint     |
+ | `SMART_CODING_MILVUS_TOKEN`          | —                         | Auth token          |
+ | `SMART_CODING_MILVUS_DATABASE`       | `default`                 | Database name       |
+ | `SMART_CODING_MILVUS_COLLECTION`     | `smart_coding_embeddings` | Collection          |

  ### Search Tuning

- | Variable | Default | Description |
- |----------|---------|-------------|
- | `SMART_CODING_SEMANTIC_WEIGHT` | `0.7` | Semantic vs exact weight |
- | `SMART_CODING_EXACT_MATCH_BOOST` | `1.5` | Exact match multiplier |
+ | Variable                         | Default | Description              |
+ | -------------------------------- | ------- | ------------------------ |
+ | `SMART_CODING_SEMANTIC_WEIGHT`   | `0.7`   | Semantic vs exact weight |
+ | `SMART_CODING_EXACT_MATCH_BOOST` | `1.5`   | Exact match multiplier   |

  ### Example with Gemini + Milvus

@@ -184,8 +186,8 @@ All settings via environment variables. Prefix: `SMART_CODING_`.
  {
    "mcpServers": {
      "semantic-code-mcp": {
-       "command": "semantic-code-mcp",
-       "args": ["--workspace", "/path/to/project"],
+       "command": "npx",
+       "args": ["-y", "semantic-code-mcp@latest", "--workspace", "/path/to/project"],
        "env": {
          "SMART_CODING_EMBEDDING_PROVIDER": "gemini",
          "SMART_CODING_GEMINI_API_KEY": "YOUR_KEY",
@@ -199,44 +201,64 @@ All settings via environment variables. Prefix: `SMART_CODING_`.

  ## Architecture

- ```
- semantic-code-mcp/
- ├── index.js                  # MCP server entry point
- ├── lib/
- │   ├── config.js             # Configuration loader
- │   ├── cache-factory.js      # SQLite / Milvus provider selection
- │   ├── cache.js              # SQLite vector store
- │   ├── milvus-cache.js       # Milvus vector store
- │   ├── mrl-embedder.js       # Local MRL embedder
- │   ├── gemini-embedder.js    # Gemini API embedder
- │   ├── ast-chunker.js        # Tree-sitter AST chunking
- │   ├── tokenizer.js          # Token counting
- │   └── utils.js              # Cosine similarity, hashing, smart chunking
- ├── features/
- │   ├── hybrid-search.js      # Semantic + exact match search
- │   ├── index-codebase.js     # File discovery & incremental indexing
- │   ├── clear-cache.js        # Cache reset
- │   ├── check-last-version.js # Package version lookup
- │   ├── set-workspace.js      # Runtime workspace switching
- │   └── get-status.js         # Server status
- └── test/                     # Vitest test suite
+ ```mermaid
+ graph TB
+   subgraph MCP["MCP Server (index.js)"]
+     direction TB
+     CFG["config.js<br/>Configuration"]
+   end
+
+   subgraph Features
+     SEARCH["hybrid-search.js<br/>Semantic + Exact Match"]
+     INDEX["index-codebase.js<br/>File Discovery & Indexing"]
+     STATUS["get-status.js<br/>Server Health"]
+     WORKSPACE["set-workspace.js<br/>Runtime Switching"]
+     VERSION["check-last-version.js<br/>Registry Lookup"]
+     CLEAR["clear-cache.js<br/>Cache Reset"]
+   end
+
+   subgraph Embeddings["Embedding Providers"]
+     LOCAL["mrl-embedder.js<br/>nomic-embed-text v1.5"]
+     GEMINI["gemini-embedder.js<br/>Gemini / Vertex AI"]
+     OAI["OpenAI / Compatible"]
+   end
+
+   subgraph Storage["Vector Storage"]
+     SQLITE["cache.js<br/>SQLite (default)"]
+     MILVUS["milvus-cache.js<br/>Milvus ANN"]
+     FACTORY["cache-factory.js<br/>Provider Selection"]
+   end
+
+   subgraph Chunking["Code Chunking"]
+     AST["ast-chunker.js<br/>Tree-sitter AST"]
+     SMART["utils.js<br/>Smart Regex"]
+   end
+
+   MCP --> Features
+   INDEX --> Chunking --> Embeddings --> FACTORY
+   FACTORY --> SQLITE
+   FACTORY --> MILVUS
+   SEARCH --> Embeddings
+   SEARCH --> FACTORY
  ```

  ## How It Works

- ```
- Your code files
-     ↓ glob + .gitignore-aware discovery
- Smart/AST chunking
-     ↓ language-aware splitting
- AI embedding (local or API)
-     ↓ vector generation
- SQLite or Milvus storage
-     ↓ incremental, hash-based updates
-
- Search query
-     ↓ embed query → cosine similarity → exact match boost
- Top N results with relevance scores
+ ```mermaid
+ flowchart LR
+   A["📁 Source Files"] -->|glob + .gitignore| B["✂️ Smart/AST\nChunking"]
+   B -->|language-aware| C["🧠 AI Embedding\n(Local or API)"]
+   C -->|vectors| D["💾 SQLite / Milvus\nStorage"]
+   D -->|incremental hash| D
+
+   E["🔍 Search Query"] -->|embed| C
+   C -->|cosine similarity| F["📊 Hybrid Scoring\nsemantic + exact match"]
+   F --> G["🎯 Top N Results\nwith relevance scores"]
+
+   style A fill:#2d3748,color:#e2e8f0
+   style C fill:#553c9a,color:#e9d8fd
+   style D fill:#2a4365,color:#bee3f8
+   style G fill:#22543d,color:#c6f6d5
  ```

  **Progressive indexing** — search works immediately while indexing continues in the background. Only changed files are re-indexed on subsequent runs.
@@ -256,4 +278,4 @@ See [LICENSE](LICENSE) for full text.

  ---

- *Built on [smart-coding-mcp](https://github.com/omarHaris/smart-coding-mcp) by Omar Haris. Extended with multi-provider embeddings, Milvus ANN search, AST chunking, resource throttling, and comprehensive test suite.*
+ *Forked from [smart-coding-mcp](https://github.com/omarHaris/smart-coding-mcp) by Omar Haris. Extended with multi-provider embeddings (Gemini, Vertex AI, OpenAI), Milvus ANN search, AST chunking, resource throttling, and comprehensive test suite.*
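The README's search-tuning defaults (`SMART_CODING_SEMANTIC_WEIGHT` = 0.7, `SMART_CODING_EXACT_MATCH_BOOST` = 1.5) describe a hybrid of cosine similarity and exact-match boosting. How the package actually combines them lives in `features/hybrid-search.js`; the sketch below is one plausible combination, not the package's implementation, and the function names are illustrative only:

```python
import math

# Documented defaults; the combination formula below is an assumption.
SEMANTIC_WEIGHT = 0.7
EXACT_MATCH_BOOST = 1.5

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_score(query_vec, chunk_vec, query_text, chunk_text):
    """Blend semantic similarity with an exact-substring signal,
    then multiply by the boost when the query appears verbatim."""
    semantic = cosine_similarity(query_vec, chunk_vec)
    exact = 1.0 if query_text.lower() in chunk_text.lower() else 0.0
    score = SEMANTIC_WEIGHT * semantic + (1 - SEMANTIC_WEIGHT) * exact
    if exact:
        score *= EXACT_MATCH_BOOST
    return score
```

An identical chunk containing the query verbatim would score 1.0 before the boost and 1.5 after it, which matches the intuition behind the two knobs: the weight trades semantic recall against literal matches, and the boost pushes literal matches to the top.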
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "semantic-code-mcp",
-   "version": "2.0.0",
+   "version": "2.1.0",
    "description": "AI-powered semantic code search for coding agents. MCP server with multi-provider embeddings and hybrid search.",
    "type": "module",
    "main": "index.js",
@@ -42,12 +42,12 @@
    ],
    "repository": {
      "type": "git",
-     "url": "https://github.com/bitkyc08-arch/smart-coding-mcp.git"
+     "url": "https://github.com/bitkyc08-arch/semantic-code-mcp.git"
    },
    "bugs": {
-     "url": "https://github.com/bitkyc08-arch/smart-coding-mcp/issues"
+     "url": "https://github.com/bitkyc08-arch/semantic-code-mcp/issues"
    },
-   "homepage": "https://github.com/bitkyc08-arch/smart-coding-mcp#readme",
+   "homepage": "https://github.com/bitkyc08-arch/semantic-code-mcp#readme",
    "license": "MIT",
    "dependencies": {
      "@huggingface/transformers": "^3.8.1",
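The README's `SMART_CODING_EMBEDDING_DIMENSION` setting (default 128, range 64–768) relies on Matryoshka Representation Learning (MRL): a full 768-dim nomic-embed-text-v1.5 vector can be truncated to its first N values and re-normalized while staying usable for similarity search. A minimal sketch of that truncation, not the package's `mrl-embedder.js` code:

```python
import math

def truncate_mrl(embedding, dim=128):
    """Keep the first `dim` values of an MRL embedding and re-normalize
    to unit length so cosine similarity remains well behaved.
    Illustrative only; the real embedder lives in lib/mrl-embedder.js."""
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head
```

Smaller dimensions shrink the vector store and speed up search at some cost in retrieval quality, which is why the package exposes the dimension as a tunable rather than fixing it at the model's native 768.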