codebaxing 0.2.16 → 0.2.20
- package/README.md +61 -301
- package/README.vi.md +65 -305
- package/dist/indexing/source-retriever.d.ts.map +1 -1
- package/dist/indexing/source-retriever.js +139 -107
- package/dist/indexing/source-retriever.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -5,196 +5,85 @@
 
 **[English](README.md)** | [Tiếng Việt](README.vi.md)
 
-MCP server for **semantic code search**. Index your codebase
+MCP server for **semantic code search**. Index your codebase once, then search using natural language.
 
-##
-
-- [The Idea](#the-idea)
-- [Quick Start](#quick-start)
-- [Installation](#installation)
-- [Usage](#usage)
-- [Configuration](#configuration)
-- [How It Works](#how-it-works)
-
-## The Idea
-
-Traditional code search (grep, ripgrep) matches exact text. But developers think in concepts:
-
-- *"Where is the authentication logic?"* - not `grep "authentication"`
-- *"Find database connection code"* - not `grep "database"`
-
-**Codebaxing** bridges this gap using **semantic search**:
-
-```
-Query: "user authentication"
-↓
-Finds: login(), validateCredentials(), checkPassword(), authMiddleware()
-(even if they don't contain the word "authentication")
-```
-
-## Quick Start
-
-### Step 1: Start ChromaDB (Required)
-
-ChromaDB is required for persistent storage. Start it with Docker:
-
-```bash
-docker run -d -p 8000:8000 --name chromadb chromadb/chroma
+## How It Works
 
-# Verify it's running
-curl http://localhost:8000/api/v2/heartbeat
 ```
-
-
-
-```bash
-npx codebaxing install # Claude Desktop
-npx codebaxing install --cursor # Cursor
-npx codebaxing install --windsurf # Windsurf
-npx codebaxing install --all # All editors
+Your Code → Tree-sitter Parser → Symbols → Embedding Model → Vectors → ChromaDB
+↓
+"find auth logic" → Embedding → Query Vector → Similarity Search → Results
 ```
 
-
-
-Restart your editor, then ask: *"Index my project at /path/to/myproject"*
+Traditional search matches exact text. Codebaxing understands meaning:
 
-
-
-
-
-export CHROMADB_URL=http://localhost:8000
-
-# Index and search
-npx codebaxing index /path/to/project
-npx codebaxing search "authentication logic"
-```
-
-## Installation
+| Query | Finds (even without exact match) |
+|-------|----------------------------------|
+| "authentication" | login(), validateCredentials(), authMiddleware() |
+| "database connection" | connectDB(), prismaClient, repository.query() |
 
-
+## Quick Start
 
-
+### 1. Start ChromaDB
 
 ```bash
-# Start ChromaDB with Docker
 docker run -d -p 8000:8000 --name chromadb chromadb/chroma
-
-# Verify it's running
-curl http://localhost:8000/api/v2/heartbeat
-# Should return: {"nanosecond heartbeat":...}
 ```
 
-
-
-### Step 2: Install to AI Editor
+### 2. Index Your Codebase (CLI)
 
 ```bash
-npx codebaxing
-npx codebaxing install --cursor # Cursor
-npx codebaxing install --windsurf # Windsurf (Codeium)
-npx codebaxing install --zed # Zed
-npx codebaxing install --all # All supported editors
+npx codebaxing@latest index /path/to/your/project
 ```
 
-
-
-### Step 3: Restart Your Editor
+This creates a `.codebaxing/` folder with the index. Only needs to be done once per project.
 
-
-
-### Uninstall
+### 3. Install MCP Server for AI Editors
 
 ```bash
-npx codebaxing
-npx codebaxing
-
-
-## Usage
-
-### Via AI Agents (Claude Desktop, Cursor, etc.)
-
-After installing, interact through natural conversation:
-
-#### 1. Index your codebase (Required first)
-
-```
-You: Index the codebase at /Users/me/projects/myapp
-```
-
-> **Note:** First run downloads the embedding model (~90MB), takes 1-2 minutes.
-
-#### 2. Search for code
-
-```
-You: Find code that handles user authentication
-You: Where is the database connection logic?
-You: Show me error handling patterns
-```
-
-#### 3. Use memory (optional)
-
-```
-You: Remember that we're using PostgreSQL with Prisma ORM
-You: What decisions have we made about the database?
+npx codebaxing install # Claude Desktop
+npx codebaxing install --cursor # Cursor
+npx codebaxing install --windsurf # Windsurf
+npx codebaxing install --all # All editors
 ```
 
-
+Restart your editor. Now you can ask: *"Find the authentication logic"*
 
-
-|------|-------------|---------|
-| `index` | Index a codebase (**required first**) | `index(path="/project")` |
-| `search` | Semantic code search | `search(question="auth middleware")` |
-| `stats` | Index statistics | `stats()` |
-| `languages` | Supported extensions | `languages()` |
-| `remember` | Store memory | `remember(content="Using Redis", memory_type="decision")` |
-| `recall` | Retrieve memories | `recall(query="database")` |
-| `forget` | Delete memories | `forget(memory_type="note")` |
-| `memory-stats` | Memory statistics | `memory-stats()` |
+## CLI Commands
 
-
+| Command | Description |
+|---------|-------------|
+| `npx codebaxing@latest index <path>` | Index a codebase (**required first**) |
+| `npx codebaxing search <query>` | Search indexed code |
+| `npx codebaxing stats [path]` | Show index statistics |
+| `npx codebaxing install [--editor]` | Install MCP server |
+| `npx codebaxing uninstall [--editor]` | Uninstall MCP server |
 
-
+> **Tip:** Use `@latest` for `index` to ensure you have the newest version.
 
-
+### Search Options
 
 ```bash
-
-docker run -d -p 8000:8000 --name chromadb chromadb/chroma
-
-# Set environment variable (add to your ~/.bashrc or ~/.zshrc)
-export CHROMADB_URL=http://localhost:8000
+npx codebaxing search "auth middleware" --path ./src --limit 10
 ```
 
-
+- `--path, -p` - Codebase path (default: current directory)
+- `--limit, -n` - Number of results (default: 5)
 
-
-# Index a codebase
-npx codebaxing index /path/to/project
+## MCP Tools (for AI Agents)
 
-
-npx codebaxing search "authentication middleware"
-npx codebaxing search "database connection" --path ./src --limit 10
+After installing, AI agents can use these tools:
 
-
-
+| Tool | Description |
+|------|-------------|
+| `search` | Semantic code search |
+| `stats` | Index statistics |
+| `languages` | Supported file extensions |
+| `remember` | Store project memory |
+| `recall` | Retrieve memories |
+| `forget` | Delete memories |
 
-
-npx codebaxing --help
-```
-
-#### CLI Reference
-
-| Command | Description |
-|---------|-------------|
-| `npx codebaxing install [--editor]` | Install MCP server to AI editor |
-| `npx codebaxing uninstall [--editor]` | Uninstall MCP server |
-| `npx codebaxing index <path>` | Index a codebase |
-| `npx codebaxing search <query> [options]` | Search indexed codebase |
-| `npx codebaxing stats [path]` | Show index statistics |
-
-**Search options:**
-- `--path, -p <path>` - Codebase path (default: current directory)
-- `--limit, -n <number>` - Number of results (default: 5)
+> **Note:** The `index` tool is disabled for AI agents. Use CLI: `npx codebaxing@latest index <path>`
 
 ## Configuration
 
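The "How It Works" pipeline added in this hunk ends in a similarity search over stored vectors. As a rough illustration only (this is not the package's code, which per the README hands vector storage to ChromaDB), the ranking step could be sketched in TypeScript as follows. The `IndexedSymbol` type, the `topK` helper, and the three-dimensional toy vectors are all assumptions made for the example; the README states that the real model, `all-MiniLM-L6-v2`, produces 384-dimensional vectors.

```typescript
// Hypothetical shape of an indexed symbol; not taken from the codebaxing source.
interface IndexedSymbol {
  name: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored symbol against the query vector and keep the best `limit`.
function topK(query: number[], symbols: IndexedSymbol[], limit = 5): IndexedSymbol[] {
  return symbols
    .map((symbol) => ({ symbol, score: cosineSimilarity(query, symbol.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((entry) => entry.symbol);
}

// Toy 3-dimensional vectors standing in for real embeddings (illustrative values).
const toyIndex: IndexedSymbol[] = [
  { name: "login", vector: [0.9, 0.1, 0.0] },
  { name: "connectDB", vector: [0.0, 0.2, 0.9] },
  { name: "authMiddleware", vector: [0.8, 0.3, 0.1] },
];
const authResults = topK([1, 0, 0], toyIndex, 2).map((s) => s.name);
```

With these toy values `authResults` holds `login` followed by `authMiddleware`: both point in roughly the same direction as the query vector, while `connectDB` is orthogonal to it and is dropped by the limit of 2. The default `limit = 5` mirrors the `--limit` default documented in the README.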
@@ -202,29 +91,18 @@ npx codebaxing --help
 
 | Variable | Description | Default |
 |----------|-------------|---------|
-| `CHROMADB_URL` | ChromaDB server URL
-| `CODEBAXING_DEVICE` | Compute
-| `CODEBAXING_MAX_FILE_SIZE` | Max file size
+| `CHROMADB_URL` | ChromaDB server URL | `http://localhost:8000` |
+| `CODEBAXING_DEVICE` | Compute: `cpu`, `webgpu`, `cuda` | `cpu` |
+| `CODEBAXING_MAX_FILE_SIZE` | Max file size in MB | `1` |
+| `CODEBAXING_MAX_CHUNKS` | Max chunks to index | `100000` |
+| `CODEBAXING_FILES_PER_BATCH` | Files per batch (lower = less RAM) | `50` |
 
-###
-
-```bash
-export CODEBAXING_DEVICE=webgpu # macOS (Metal)
-export CODEBAXING_DEVICE=cuda # Linux/Windows (NVIDIA)
-export CODEBAXING_DEVICE=auto # Auto-detect
-```
-
-**Note:** macOS does not support CUDA. Use `webgpu` for GPU acceleration on Mac.
-
-### Manual Editor Configuration
-
-If you prefer to configure manually instead of using `npx codebaxing install`:
+### Manual Editor Config
 
 <details>
 <summary>Claude Desktop</summary>
 
-`~/Library/Application Support/Claude/claude_desktop_config.json`
-`%APPDATA%\Claude\claude_desktop_config.json` (Windows)
+`~/Library/Application Support/Claude/claude_desktop_config.json`
 
 ```json
 {
@@ -232,9 +110,7 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 "codebaxing": {
 "command": "npx",
 "args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
+"env": { "CHROMADB_URL": "http://localhost:8000" }
 }
 }
 }
@@ -252,9 +128,7 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 "codebaxing": {
 "command": "npx",
 "args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
+"env": { "CHROMADB_URL": "http://localhost:8000" }
 }
 }
 }
@@ -262,147 +136,33 @@ If you prefer to configure manually instead of using `npx codebaxing install`:
 </details>
 
 <details>
-<summary>
+<summary>Other Editors</summary>
 
-`~/.codeium/windsurf/mcp_config.json`
+**Windsurf:** `~/.codeium/windsurf/mcp_config.json`
+**Zed:** `~/.config/zed/settings.json` (use `context_servers` key)
+**VS Code + Continue:** `~/.continue/config.json`
 
-```json
-{
-"mcpServers": {
-"codebaxing": {
-"command": "npx",
-"args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-}
-```
 </details>
 
-<details>
-<summary>Zed</summary>
-
-`~/.config/zed/settings.json`
-
-```json
-{
-"context_servers": {
-"codebaxing": {
-"command": {
-"path": "npx",
-"args": ["-y", "codebaxing"]
-},
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-}
-```
-</details>
-
-<details>
-<summary>VS Code + Continue</summary>
-
-`~/.continue/config.json`
-
-```json
-{
-"experimental": {
-"modelContextProtocolServers": [
-{
-"transport": {
-"type": "stdio",
-"command": "npx",
-"args": ["-y", "codebaxing"],
-"env": {
-"CHROMADB_URL": "http://localhost:8000"
-}
-}
-}
-]
-}
-}
-```
-</details>
-
-## How It Works
-
-### Architecture
-
-```
-┌─────────────────────────────────────────────────────────────────┐
-│ INDEXING │
-├─────────────────────────────────────────────────────────────────┤
-│ Source Files (.py, .ts, .js, .go, .rs, ...) │
-│ │ │
-│ ▼ │
-│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
-│ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
-│ │ Parser │ │ Extraction │ │ Model │ │
-│ └──────────────┘ └──────────────┘ └──────────────┘ │
-│ │ │ │ │
-│ Parse AST Functions, Text → Vector │
-│ Classes, etc. (384 dimensions) │
-│ │ │
-│ ▼ │
-│ ┌──────────────┐ │
-│ │ ChromaDB │ │
-│ │ (vectors) │ │
-│ └──────────────┘ │
-└─────────────────────────────────────────────────────────────────┘
-```
-
-### Why Semantic Search Works
-
-The embedding model understands that:
-
-| Query | Finds (even without exact match) |
-|-------|----------------------------------|
-| "authentication" | login, credentials, auth, signin, validateUser |
-| "database" | query, SQL, connection, ORM, repository |
-| "error handling" | try/catch, exception, throw, ErrorBoundary |
-
 ## Supported Languages
 
-Python, JavaScript, TypeScript,
-
-## Features
-
-- **Semantic Code Search**: Find code by describing what you're looking for
-- **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
-- **Memory Layer**: Store and recall project context across sessions
-- **Incremental Indexing**: Only re-index changed files
-- **100% Local**: No API calls, no cloud, works offline
-- **GPU Acceleration**: Optional WebGPU/CUDA support
+Python, JavaScript, TypeScript, Go, Rust, Java, C/C++, C#, Ruby, PHP, Kotlin, Swift, Scala, Lua, Dart, Elixir, Haskell, OCaml, Zig, Perl, Bash, HTML, CSS, Vue, JSON, YAML, TOML, Makefile
 
 ## Requirements
 
 - Node.js >= 20.0.0
-- Docker (for ChromaDB
-- ~500MB disk space
+- Docker (for ChromaDB)
+- ~500MB disk space (embedding model)
 
 ## Technical Details
 
 | Component | Technology |
 |-----------|------------|
 | Embedding Model | `all-MiniLM-L6-v2` (384 dimensions) |
-| Model Runtime | `@huggingface/transformers` (ONNX) |
 | Vector Database | ChromaDB |
 | Code Parser | Tree-sitter |
 | MCP SDK | `@modelcontextprotocol/sdk` |
 
-## Development
-
-```bash
-npm install # Install dependencies
-npm run dev # Run with tsx (no build needed)
-npm run build # Compile TypeScript
-npm test # Run tests
-```
-
 ## License
 
 MIT
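The Configuration table in the 0.2.20 README lists five environment variables and their defaults. The sketch below shows one way such settings might be read in TypeScript; it is not taken from the package. The `CodebaxingConfig` type and the `loadConfig` and `positiveNumber` helpers are hypothetical names, and falling back to the default when a value is missing, unparsable, or an unrecognized device is an assumption about behavior the README does not describe.

```typescript
// Hypothetical config shape mirroring the README's Configuration table.
interface CodebaxingConfig {
  chromaDbUrl: string;
  device: "cpu" | "webgpu" | "cuda";
  maxFileSizeMb: number;
  maxChunks: number;
  filesPerBatch: number;
}

// Environment passed in explicitly so the sketch has no runtime dependencies.
type Env = Record<string, string | undefined>;

// Parse a positive number; fall back to the default otherwise (assumed behavior).
function positiveNumber(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadConfig(env: Env): CodebaxingConfig {
  const device = env["CODEBAXING_DEVICE"];
  return {
    // Defaults below are the ones listed in the README table.
    chromaDbUrl: env["CHROMADB_URL"] ?? "http://localhost:8000",
    device: device === "webgpu" || device === "cuda" ? device : "cpu",
    maxFileSizeMb: positiveNumber(env["CODEBAXING_MAX_FILE_SIZE"], 1),
    maxChunks: positiveNumber(env["CODEBAXING_MAX_CHUNKS"], 100000),
    filesPerBatch: positiveNumber(env["CODEBAXING_FILES_PER_BATCH"], 50),
  };
}

const defaults = loadConfig({});
const tuned = loadConfig({ CODEBAXING_DEVICE: "cuda", CODEBAXING_FILES_PER_BATCH: "10" });
```

In a Node.js process the same function could be called as `loadConfig(process.env)`. Lowering `CODEBAXING_FILES_PER_BATCH`, as in `tuned`, is the knob the README points to for reducing RAM use during indexing.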