codebaxing 0.2.15 → 0.2.18
- package/README.md +60 -302
- package/README.vi.md +64 -306
- package/dist/indexing/source-retriever.d.ts.map +1 -1
- package/dist/indexing/source-retriever.js +139 -107
- package/dist/indexing/source-retriever.js.map +1 -1
- package/dist/mcp/server.js +6 -2
- package/dist/mcp/server.js.map +1 -1
- package/dist/mcp/state.d.ts +1 -1
- package/dist/mcp/state.d.ts.map +1 -1
- package/dist/mcp/state.js +9 -6
- package/dist/mcp/state.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -5,196 +5,83 @@
 
 **[English](README.md)** | [Tiếng Việt](README.vi.md)
 
-MCP server for **semantic code search**. Index your codebase
+MCP server for **semantic code search**. Index your codebase once, then search using natural language.
 
-##
-
-- [The Idea](#the-idea)
-- [Quick Start](#quick-start)
-- [Installation](#installation)
-- [Usage](#usage)
-- [Configuration](#configuration)
-- [How It Works](#how-it-works)
-
-## The Idea
-
-Traditional code search (grep, ripgrep) matches exact text. But developers think in concepts:
-
-- *"Where is the authentication logic?"* - not `grep "authentication"`
-- *"Find database connection code"* - not `grep "database"`
-
-**Codebaxing** bridges this gap using **semantic search**:
-
-```
-Query: "user authentication"
-↓
-Finds: login(), validateCredentials(), checkPassword(), authMiddleware()
-(even if they don't contain the word "authentication")
-```
-
-## Quick Start
-
-### Step 1: Start ChromaDB (Required)
-
-ChromaDB is required for persistent storage. Start it with Docker:
-
-```bash
-docker run -d -p 8000:8000 --name chromadb chromadb/chroma
+## How It Works
 
-# Verify it's running
-curl http://localhost:8000/api/v2/heartbeat
 ```
-
-
-
-```bash
-npx codebaxing install # Claude Desktop
-npx codebaxing install --cursor # Cursor
-npx codebaxing install --windsurf # Windsurf
-npx codebaxing install --all # All editors
+Your Code → Tree-sitter Parser → Symbols → Embedding Model → Vectors → ChromaDB
+↓
+"find auth logic" → Embedding → Query Vector → Similarity Search → Results
 ```
 
-
-
-Restart your editor, then ask: *"Index my project at /path/to/myproject"*
+Traditional search matches exact text. Codebaxing understands meaning:
 
-
-
-
-
-export CHROMADB_URL=http://localhost:8000
-
-# Index and search
-npx codebaxing index /path/to/project
-npx codebaxing search "authentication logic"
-```
-
-## Installation
+| Query | Finds (even without exact match) |
+|-------|----------------------------------|
+| "authentication" | login(), validateCredentials(), authMiddleware() |
+| "database connection" | connectDB(), prismaClient, repository.query() |
 
-
+## Quick Start
 
-
+### 1. Start ChromaDB
 
 ```bash
-# Start ChromaDB with Docker
 docker run -d -p 8000:8000 --name chromadb chromadb/chroma
-
-# Verify it's running
-curl http://localhost:8000/api/v2/heartbeat
-# Should return: {"nanosecond heartbeat":...}
 ```
 
-
-
-### Step 2: Install to AI Editor
+### 2. Index Your Codebase (CLI)
 
 ```bash
-npx codebaxing
-npx codebaxing install --cursor # Cursor
-npx codebaxing install --windsurf # Windsurf (Codeium)
-npx codebaxing install --zed # Zed
-npx codebaxing install --all # All supported editors
+npx codebaxing index /path/to/your/project
 ```
 
-
-
-### Step 3: Restart Your Editor
+This creates a `.codebaxing/` folder with the index. Only needs to be done once per project.
 
-
-
-### Uninstall
+### 3. Install MCP Server for AI Editors
 
 ```bash
-npx codebaxing
-npx codebaxing
-
-
-## Usage
-
-### Via AI Agents (Claude Desktop, Cursor, etc.)
-
-After installing, interact through natural conversation:
-
-#### 1. Index your codebase (Required first)
-
-```
-You: Index the codebase at /Users/me/projects/myapp
-```
-
-> **Note:** First run downloads the embedding model (~90MB), takes 1-2 minutes.
-
-#### 2. Search for code
-
-```
-You: Find code that handles user authentication
-You: Where is the database connection logic?
-You: Show me error handling patterns
-```
-
-#### 3. Use memory (optional)
-
-```
-You: Remember that we're using PostgreSQL with Prisma ORM
-You: What decisions have we made about the database?
+npx codebaxing install # Claude Desktop
+npx codebaxing install --cursor # Cursor
+npx codebaxing install --windsurf # Windsurf
+npx codebaxing install --all # All editors
 ```
 
-
-
-| Tool | Description | Example |
-|------|-------------|---------|
-| `index` | Index a codebase (**required first**) | `index(path="/project")` |
-| `search` | Semantic code search | `search(question="auth middleware")` |
-| `stats` | Index statistics | `stats()` |
-| `languages` | Supported extensions | `languages()` |
-| `remember` | Store memory | `remember(content="Using Redis", memory_type="decision")` |
-| `recall` | Retrieve memories | `recall(query="database")` |
-| `forget` | Delete memories | `forget(memory_type="note")` |
-| `memory-stats` | Memory statistics | `memory-stats()` |
+Restart your editor. Now you can ask: *"Find the authentication logic"*
 
-
+## CLI Commands
 
-
+| Command | Description |
+|---------|-------------|
+| `npx codebaxing index <path>` | Index a codebase (**required first**) |
+| `npx codebaxing search <query>` | Search indexed code |
+| `npx codebaxing stats [path]` | Show index statistics |
+| `npx codebaxing install [--editor]` | Install MCP server |
+| `npx codebaxing uninstall [--editor]` | Uninstall MCP server |
 
-
+### Search Options
 
 ```bash
-
-docker run -d -p 8000:8000 --name chromadb chromadb/chroma
-
-# Set environment variable (add to your ~/.bashrc or ~/.zshrc)
-export CHROMADB_URL=http://localhost:8000
+npx codebaxing search "auth middleware" --path ./src --limit 10
 ```
 
-
+- `--path, -p` - Codebase path (default: current directory)
+- `--limit, -n` - Number of results (default: 5)
 
-
-# Index a codebase
-npx codebaxing index /path/to/project
-
-# Search for code
-npx codebaxing search "authentication middleware"
-npx codebaxing search "database connection" --path ./src --limit 10
-
-# Show index statistics
-npx codebaxing stats /path/to/project
-
-# Show help
-npx codebaxing --help
-```
+## MCP Tools (for AI Agents)
 
-
+After installing, AI agents can use these tools:
 
-|
-
-| `
-| `
-| `
-| `
-| `
+| Tool | Description |
+|------|-------------|
+| `search` | Semantic code search |
+| `stats` | Index statistics |
+| `languages` | Supported file extensions |
+| `remember` | Store project memory |
+| `recall` | Retrieve memories |
+| `forget` | Delete memories |
 
-**
-- `--path, -p <path>` - Codebase path (default: current directory)
-- `--limit, -n <number>` - Number of results (default: 5)
+> **Note:** The `index` tool is disabled for AI agents. Use CLI to index: `npx codebaxing index <path>`
 
 ## Configuration
 
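The "How It Works" pipeline added in the hunk above ends in a similarity search between a query vector and the stored symbol vectors. The sketch below illustrates that last step only. It makes several assumptions: the names `IndexedSymbol`, `cosineSimilarity`, and `rankBySimilarity` are hypothetical, the vectors are hand-written 3-dimensional toys rather than real embeddings, and cosine similarity is chosen for illustration. The package delegates the actual lookup to ChromaDB, whose metric may differ.

```typescript
// Hypothetical types and toy vectors for illustration only; the README states
// that the real model (all-MiniLM-L6-v2) produces 384-dimensional vectors.
interface IndexedSymbol {
  name: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored symbol against the query vector and keep the top `limit` names.
function rankBySimilarity(query: number[], symbols: IndexedSymbol[], limit: number): string[] {
  return symbols
    .map((s) => ({ name: s.name, score: cosineSimilarity(query, s.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((s) => s.name);
}

const symbols: IndexedSymbol[] = [
  { name: "login()", vector: [0.9, 0.1, 0.0] },
  { name: "connectDB()", vector: [0.1, 0.9, 0.1] },
  { name: "authMiddleware()", vector: [0.8, 0.2, 0.1] },
];

// Stand-in for the embedding of the query "find auth logic".
const queryVector = [1, 0, 0];
const topMatches = rankBySimilarity(queryVector, symbols, 2);
console.log(topMatches);
```

With these toy vectors the two auth-related symbols rank ahead of `connectDB()`, even though none of the names contains the query text. That is the behaviour the new README table describes.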
@@ -202,29 +89,18 @@ npx codebaxing --help
 
 | Variable | Description | Default |
 |----------|-------------|---------|
-| `CHROMADB_URL` | ChromaDB server URL
-| `CODEBAXING_DEVICE` | Compute
-| `CODEBAXING_MAX_FILE_SIZE` | Max file size
-
-
-
-```bash
-export CODEBAXING_DEVICE=webgpu # macOS (Metal)
-export CODEBAXING_DEVICE=cuda # Linux/Windows (NVIDIA)
-export CODEBAXING_DEVICE=auto # Auto-detect
-```
-
-**Note:** macOS does not support CUDA. Use `webgpu` for GPU acceleration on Mac.
+| `CHROMADB_URL` | ChromaDB server URL | `http://localhost:8000` |
+| `CODEBAXING_DEVICE` | Compute: `cpu`, `webgpu`, `cuda` | `cpu` |
+| `CODEBAXING_MAX_FILE_SIZE` | Max file size in MB | `1` |
+| `CODEBAXING_MAX_CHUNKS` | Max chunks to index | `100000` |
+| `CODEBAXING_FILES_PER_BATCH` | Files per batch (lower = less RAM) | `50` |
 
-### Manual Editor
-
-If you prefer to configure manually instead of using `npx codebaxing install`:
+### Manual Editor Config
 
 <details>
 <summary>Claude Desktop</summary>
 
-`~/Library/Application Support/Claude/claude_desktop_config.json`
-`%APPDATA%\Claude\claude_desktop_config.json` (Windows)
+`~/Library/Application Support/Claude/claude_desktop_config.json`
 
 ```json
 {
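The configuration table added in the hunk above lists five environment variables and their defaults. One way those defaults could be applied is sketched below. It is not the package's implementation: `CodebaxingConfig`, `resolveConfig`, and `positiveNumber` are hypothetical names, and falling back to the default on an unset or invalid value is an assumption.

```typescript
// Hypothetical config shape; the field names are illustrative, not the package's own.
interface CodebaxingConfig {
  chromadbUrl: string;
  device: "cpu" | "webgpu" | "cuda";
  maxFileSizeMb: number;
  maxChunks: number;
  filesPerBatch: number;
}

type Env = Record<string, string | undefined>;

// Parse a positive number, falling back to the documented default when the
// variable is unset, empty, or not a positive finite number (assumed behaviour).
function positiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Map the documented variables onto a typed config object, using the
// defaults from the README table.
function resolveConfig(env: Env): CodebaxingConfig {
  const device = env["CODEBAXING_DEVICE"];
  return {
    chromadbUrl: env["CHROMADB_URL"] ?? "http://localhost:8000",
    device: device === "webgpu" || device === "cuda" ? device : "cpu",
    maxFileSizeMb: positiveNumber(env["CODEBAXING_MAX_FILE_SIZE"], 1),
    maxChunks: positiveNumber(env["CODEBAXING_MAX_CHUNKS"], 100000),
    filesPerBatch: positiveNumber(env["CODEBAXING_FILES_PER_BATCH"], 50),
  };
}

// No variables set: every field takes the documented default.
const defaults = resolveConfig({});
// Lower the batch size to reduce memory use and pick a GPU backend.
const tuned = resolveConfig({ CODEBAXING_DEVICE: "webgpu", CODEBAXING_FILES_PER_BATCH: "10" });
console.log(defaults, tuned);
```

Passing the environment as a plain record keeps the sketch free of Node-specific types. In a real process the argument would be `process.env`.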
@@ -232,9 +108,7 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 "codebaxing": {
 "command": "npx",
 "args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
+"env": { "CHROMADB_URL": "http://localhost:8000" }
 }
 }
 }
@@ -252,51 +126,7 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 "codebaxing": {
 "command": "npx",
 "args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-}
-```
-</details>
-
-<details>
-<summary>Windsurf</summary>
-
-`~/.codeium/windsurf/mcp_config.json`
-
-```json
-{
-"mcpServers": {
-"codebaxing": {
-"command": "npx",
-"args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-}
-```
-</details>
-
-<details>
-<summary>Zed</summary>
-
-`~/.config/zed/settings.json`
-
-```json
-{
-"context_servers": {
-"codebaxing": {
-"command": {
-"path": "npx",
-"args": ["-y", "codebaxing"]
-},
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
+"env": { "CHROMADB_URL": "http://localhost:8000" }
 }
 }
 }
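The manual editor config in the hunks above registers a `codebaxing` entry with `command`, `args`, and `env`. A hedged sketch of adding that entry to an existing config object without dropping other servers follows. `McpConfig`, `McpServerEntry`, and `withCodebaxing` are hypothetical names. The top-level `mcpServers` key is taken from the removed Windsurf block and is assumed for the other `mcpServers`-style editors; Zed uses `context_servers` with a different shape, so this sketch does not apply to it.

```typescript
// Hypothetical types mirroring the JSON shape shown in the README snippets.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Return a new config with the codebaxing entry added; the input object is
// not mutated and any other registered servers are kept.
function withCodebaxing(config: McpConfig, chromadbUrl: string = "http://localhost:8000"): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      codebaxing: {
        command: "npx",
        args: ["-y", "codebaxing"],
        env: { CHROMADB_URL: chromadbUrl },
      },
    },
  };
}

// Example: an editor config that already has another (made-up) server registered.
const existing: McpConfig = {
  mcpServers: { other: { command: "node", args: ["server.js"], env: {} } },
};
const merged = withCodebaxing(existing);
console.log(JSON.stringify(merged, null, 2));
```

Serialising `merged` with `JSON.stringify` yields text that could be written back to the editor's config file. Reading and writing the file is left out of the sketch.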
@@ -304,105 +134,33 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 </details>
 
 <details>
-<summary>
+<summary>Other Editors</summary>
 
-`~/.
+**Windsurf:** `~/.codeium/windsurf/mcp_config.json`
+**Zed:** `~/.config/zed/settings.json` (use `context_servers` key)
+**VS Code + Continue:** `~/.continue/config.json`
 
-```json
-{
-"experimental": {
-"modelContextProtocolServers": [
-{
-"transport": {
-"type": "stdio",
-"command": "npx",
-"args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-]
-}
-}
-```
 </details>
 
-## How It Works
-
-### Architecture
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│ INDEXING │
-├─────────────────────────────────────────────────────────────────┤
-│ Source Files (.py, .ts, .js, .go, .rs, ...) │
-│ │ │
-│ ▼ │
-│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
-│ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
-│ │ Parser │ │ Extraction │ │ Model │ │
-│ └──────────────┘ └──────────────┘ └──────────────┘ │
-│ │ │ │ │
-│ Parse AST Functions, Text → Vector │
-│ Classes, etc. (384 dimensions) │
-│ │ │
-│ ▼ │
-│ ┌──────────────┐ │
-│ │ ChromaDB │ │
-│ │ (vectors) │ │
-│ └──────────────┘ │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-### Why Semantic Search Works
-
-The embedding model understands that:
-
-| Query | Finds (even without exact match) |
-|-------|----------------------------------|
-| "authentication" | login, credentials, auth, signin, validateUser |
-| "database" | query, SQL, connection, ORM, repository |
-| "error handling" | try/catch, exception, throw, ErrorBoundary |
-
 ## Supported Languages
 
-Python, JavaScript, TypeScript,
-
-## Features
-
-- **Semantic Code Search**: Find code by describing what you're looking for
-- **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
-- **Memory Layer**: Store and recall project context across sessions
-- **Incremental Indexing**: Only re-index changed files
-- **100% Local**: No API calls, no cloud, works offline
-- **GPU Acceleration**: Optional WebGPU/CUDA support
+Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, C#, Ruby, PHP, Kotlin, Swift, Scala, Lua, Dart, Elixir, Haskell, OCaml, Zig, Perl, Bash, HTML, CSS, Vue, JSON, YAML, TOML, Makefile
 
 ## Requirements
 
 - Node.js >= 20.0.0
-- Docker (for ChromaDB
-- ~500MB disk space
+- Docker (for ChromaDB)
+- ~500MB disk space (embedding model)
 
 ## Technical Details
 
 | Component | Technology |
 |-----------|------------|
 | Embedding Model | `all-MiniLM-L6-v2` (384 dimensions) |
-| Model Runtime | `@huggingface/transformers` (ONNX) |
 | Vector Database | ChromaDB |
 | Code Parser | Tree-sitter |
 | MCP SDK | `@modelcontextprotocol/sdk` |
 
-## Development
-
-```bash
-npm install # Install dependencies
-npm run dev # Run with tsx (no build needed)
-npm run build # Compile TypeScript
-npm test # Run tests
-```
-
 ## License
 
 MIT