codebaxing 0.2.1 → 0.2.2
- package/README.md +237 -323
- package/README.vi.md +238 -324
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -3,15 +3,28 @@
|
|
|
3
3
|
[npm package](https://www.npmjs.com/package/codebaxing)
|
|
4
4
|
[License: MIT](https://opensource.org/licenses/MIT)
|
|
5
5
|
|
|
6
|
+
**[English](README.md)** | [Tiếng Việt](README.vi.md)
|
|
7
|
+
|
|
6
8
|
MCP server for **semantic code search**. Index your codebase and search using natural language queries.
|
|
7
9
|
|
|
10
|
+
## Table of Contents
|
|
11
|
+
|
|
12
|
+
- [The Idea](#the-idea)
|
|
13
|
+
- [Quick Start](#quick-start)
|
|
14
|
+
- [Usage](#usage)
|
|
15
|
+
- [Via AI Agents (MCP)](#via-ai-agents-mcp)
|
|
16
|
+
- [Via CLI (Terminal)](#via-cli-terminal)
|
|
17
|
+
- [Installation](#installation)
|
|
18
|
+
- [How It Works](#how-it-works)
|
|
19
|
+
- [Configuration](#configuration)
|
|
20
|
+
- [Supported Languages](#supported-languages)
|
|
21
|
+
|
|
8
22
|
## The Idea
|
|
9
23
|
|
|
10
24
|
Traditional code search (grep, ripgrep) matches exact text. But developers think in concepts:
|
|
11
25
|
|
|
12
26
|
- *"Where is the authentication logic?"* - not `grep "authentication"`
|
|
13
27
|
- *"Find database connection code"* - not `grep "database"`
|
|
14
|
-
- *"How does error handling work?"* - not `grep "error"`
|
|
15
28
|
|
|
16
29
|
**Codebaxing** bridges this gap using **semantic search**:
|
|
17
30
|
|
|
@@ -22,304 +35,203 @@ Finds: login(), validateCredentials(), checkPassword(), authMiddleware()
|
|
|
22
35
|
(even if they don't contain the word "authentication")
|
|
23
36
|
```
|
|
24
37
|
|
|
25
|
-
##
|
|
38
|
+
## Quick Start
|
|
26
39
|
|
|
27
|
-
###
|
|
40
|
+
### 1. Install to your AI editor
|
|
28
41
|
|
|
29
|
-
```
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
│ Source Files (.py, .ts, .js, .go, .rs, ...) │
|
|
35
|
-
│ │ │
|
|
36
|
-
│ ▼ │
|
|
37
|
-
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
38
|
-
│ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
|
|
39
|
-
│ │ Parser │ │ Extraction │ │ Model │ │
|
|
40
|
-
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
41
|
-
│ │ │ │ │
|
|
42
|
-
│ Parse AST Functions, Text → Vector │
|
|
43
|
-
│ Classes, etc. (384 dimensions) │
|
|
44
|
-
│ │ │
|
|
45
|
-
│ ▼ │
|
|
46
|
-
│ ┌──────────────┐ │
|
|
47
|
-
│ │ ChromaDB │ │
|
|
48
|
-
│ │ (vectors) │ │
|
|
49
|
-
│ └──────────────┘ │
|
|
50
|
-
│ │
|
|
51
|
-
└─────────────────────────────────────────────────────────────────┘
|
|
52
|
-
|
|
53
|
-
┌─────────────────────────────────────────────────────────────────┐
|
|
54
|
-
│ SEARCH │
|
|
55
|
-
├─────────────────────────────────────────────────────────────────┤
|
|
56
|
-
│ │
|
|
57
|
-
│ "find auth code" │
|
|
58
|
-
│ │ │
|
|
59
|
-
│ ▼ │
|
|
60
|
-
│ ┌──────────────┐ ┌──────────────┐ │
|
|
61
|
-
│ │ Embedding │────────▶│ ChromaDB │ │
|
|
62
|
-
│ │ Model │ query │ Query │ │
|
|
63
|
-
│ └──────────────┘ vector └──────────────┘ │
|
|
64
|
-
│ │ │
|
|
65
|
-
│ ▼ │
|
|
66
|
-
│ Cosine Similarity │
|
|
67
|
-
│ │ │
|
|
68
|
-
│ ▼ │
|
|
69
|
-
│ Top-k Results │
|
|
70
|
-
│ (login.py, auth.ts, ...) │
|
|
71
|
-
│ │
|
|
72
|
-
└─────────────────────────────────────────────────────────────────┘
|
|
42
|
+
```bash
|
|
43
|
+
npx codebaxing install # Claude Desktop
|
|
44
|
+
npx codebaxing install --cursor # Cursor
|
|
45
|
+
npx codebaxing install --windsurf # Windsurf
|
|
46
|
+
npx codebaxing install --all # All editors
|
|
73
47
|
```
|
|
74
48
|
|
|
75
|
-
###
|
|
49
|
+
### 2. Restart your editor
|
|
76
50
|
|
|
77
|
-
|
|
51
|
+
### 3. Start using
|
|
78
52
|
|
|
79
|
-
|
|
53
|
+
In Claude Desktop (or Cursor, Windsurf...):
|
|
80
54
|
|
|
81
|
-
```python
|
|
82
|
-
# Input: auth.py
|
|
83
|
-
def login(username, password):
|
|
84
|
-
"""Authenticate user credentials"""
|
|
85
|
-
if validate(username, password):
|
|
86
|
-
return create_session(username)
|
|
87
|
-
raise AuthError("Invalid credentials")
|
|
88
55
|
```
|
|
56
|
+
You: Index my project at /path/to/myproject
|
|
57
|
+
Claude: [calls index tool]
|
|
89
58
|
|
|
59
|
+
You: Find the authentication logic
|
|
60
|
+
Claude: [calls search tool, returns relevant code]
|
|
90
61
|
```
|
|
91
|
-
# Output: Symbol
|
|
92
|
-
{
|
|
93
|
-
name: "login",
|
|
94
|
-
type: "function",
|
|
95
|
-
signature: "def login(username, password)",
|
|
96
|
-
code: "def login(username, password):...",
|
|
97
|
-
filepath: "auth.py",
|
|
98
|
-
lineStart: 1,
|
|
99
|
-
lineEnd: 6
|
|
100
|
-
}
|
|
101
|
-
```
|
|
102
62
|
|
|
103
|
-
|
|
63
|
+
## Usage
|
|
64
|
+
|
|
65
|
+
### Via AI Agents (MCP)
|
|
66
|
+
|
|
67
|
+
After installing Codebaxing into your AI editor, you interact with it through natural conversation:
|
|
104
68
|
|
|
105
|
-
|
|
69
|
+
#### Step 1: Index your codebase (Required first)
|
|
106
70
|
|
|
107
71
|
```
|
|
108
|
-
|
|
109
|
-
↓
|
|
110
|
-
Embedding Model (runs locally, ONNX)
|
|
111
|
-
↓
|
|
112
|
-
[0.12, -0.34, 0.56, 0.08, ..., -0.22] (384 numbers)
|
|
72
|
+
You: Index the codebase at /Users/me/projects/myapp
|
|
113
73
|
```
|
|
114
74
|
|
|
115
|
-
|
|
116
|
-
- `"authentication"` ≈ `"login"` ≈ `"credentials"` (vectors are close)
|
|
117
|
-
- `"database"` ≈ `"query"` ≈ `"SQL"` (vectors are close)
|
|
118
|
-
- `"authentication"` ≠ `"database"` (vectors are far apart)
|
|
75
|
+
Claude will call `index(path="/Users/me/projects/myapp")` and show progress.
|
|
119
76
|
|
|
120
|
-
|
|
77
|
+
> **Note:** The first run downloads the embedding model (~90MB), which takes 1-2 minutes.
|
|
121
78
|
|
|
122
|
-
|
|
79
|
+
#### Step 2: Search for code
|
|
123
80
|
|
|
124
81
|
```
|
|
125
|
-
|
|
126
|
-
|
|
127
|
-
|
|
128
|
-
├─────────────────────────────────────────────────────┤
|
|
129
|
-
│ chunk_001 │ [0.12, -0.34, ...] │ {file: auth.py} │
|
|
130
|
-
│ chunk_002 │ [0.45, 0.23, ...] │ {file: db.py} │
|
|
131
|
-
│ chunk_003 │ [-0.11, 0.67, ...] │ {file: api.ts} │
|
|
132
|
-
│ ... │ ... │ ... │
|
|
133
|
-
└─────────────────────────────────────────────────────┘
|
|
82
|
+
You: Find code that handles user authentication
|
|
83
|
+
You: Where is the database connection logic?
|
|
84
|
+
You: Show me error handling patterns
|
|
134
85
|
```
|
|
135
86
|
|
|
136
|
-
####
|
|
137
|
-
|
|
138
|
-
When you search, your query is embedded and compared against all stored vectors:
|
|
87
|
+
#### Step 3: Use memory (optional)
|
|
139
88
|
|
|
140
89
|
```
|
|
141
|
-
|
|
142
|
-
|
|
143
|
-
Query Vector: [0.15, -0.31, 0.52, ...]
|
|
144
|
-
↓
|
|
145
|
-
Compare with all vectors using cosine similarity:
|
|
146
|
-
- chunk_001 (login): similarity = 0.89 ← HIGH
|
|
147
|
-
- chunk_002 (db): similarity = 0.23 ← LOW
|
|
148
|
-
- chunk_003 (auth): similarity = 0.85 ← HIGH
|
|
149
|
-
↓
|
|
150
|
-
Return top-k most similar chunks
|
|
90
|
+
You: Remember that we're using PostgreSQL with Prisma ORM
|
|
91
|
+
You: What decisions have we made about the database?
|
|
151
92
|
```
|
|
152
93
|
|
|
153
|
-
|
|
94
|
+
#### MCP Tools Reference
|
|
154
95
|
|
|
155
|
-
|
|
96
|
+
| Tool | Description | Example |
|
|
97
|
+
|------|-------------|---------|
|
|
98
|
+
| `index` | Index a codebase (**required first**) | `index(path="/project")` |
|
|
99
|
+
| `search` | Semantic code search | `search(question="auth middleware")` |
|
|
100
|
+
| `stats` | Index statistics | `stats()` |
|
|
101
|
+
| `languages` | Supported extensions | `languages()` |
|
|
102
|
+
| `remember` | Store memory | `remember(content="Using Redis", memory_type="decision")` |
|
|
103
|
+
| `recall` | Retrieve memories | `recall(query="database")` |
|
|
104
|
+
| `forget` | Delete memories | `forget(memory_type="note")` |
|
|
105
|
+
| `memory-stats` | Memory statistics | `memory-stats()` |
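A client that talks to the server directly, rather than through an editor, sends each tool call as a JSON-RPC 2.0 request. The sketch below builds the request body for the `search` row in the table. The `tools/call` method and the `name`/`arguments` parameters come from the MCP specification; the helper name `build_tool_call` and the request id are illustrative assumptions, not part of Codebaxing.

```python
import json


def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request for an MCP `tools/call` invocation.

    `build_tool_call` is a hypothetical helper used only for illustration.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Mirrors the table row: search(question="auth middleware")
request = build_tool_call("search", {"question": "auth middleware"})
print(json.dumps(request, indent=2))
```

The other tools in the table would follow the same shape, with the tool name and its arguments swapped in.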
|
|
156
106
|
|
|
157
|
-
|
|
158
|
-
|-----------|--------------|----------|
|
|
159
|
-
| authentication | login, credentials, auth, signin | Close |
|
|
160
|
-
| database | query, SQL, connection, ORM | Close |
|
|
161
|
-
| error | exception, failure, catch, throw | Close |
|
|
162
|
-
| parse | tokenize, lexer, AST, syntax | Close |
|
|
107
|
+
### Via CLI (Terminal)
|
|
163
108
|
|
|
164
|
-
|
|
109
|
+
You can also use Codebaxing directly from the terminal, without an AI agent:
|
|
165
110
|
|
|
166
|
-
|
|
167
|
-
|
|
168
|
-
- **Semantic Code Search**: Find code by describing what you're looking for
|
|
169
|
-
- **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
|
|
170
|
-
- **Memory Layer**: Store and recall project context across sessions
|
|
171
|
-
- **Incremental Indexing**: Only re-index changed files
|
|
172
|
-
- **100% Local**: No API calls, no cloud, works offline
|
|
173
|
-
- **GPU Acceleration**: Optional WebGPU/CUDA support
|
|
174
|
-
|
|
175
|
-
## Requirements
|
|
176
|
-
|
|
177
|
-
- Node.js >= 20.0.0
|
|
178
|
-
- ~500MB disk space for embedding model (downloaded on first run)
|
|
179
|
-
|
|
180
|
-
## Installation
|
|
181
|
-
|
|
182
|
-
### Quick Install (Recommended)
|
|
111
|
+
#### Step 1: Index your codebase (Required first)
|
|
183
112
|
|
|
184
113
|
```bash
|
|
185
|
-
|
|
186
|
-
npx codebaxing install
|
|
187
|
-
|
|
188
|
-
# Install to Cursor
|
|
189
|
-
npx codebaxing install --cursor
|
|
190
|
-
|
|
191
|
-
# Install to Windsurf
|
|
192
|
-
npx codebaxing install --windsurf
|
|
193
|
-
|
|
194
|
-
# Install to all supported editors
|
|
195
|
-
npx codebaxing install --all
|
|
114
|
+
npx codebaxing index /path/to/project
|
|
196
115
|
```
|
|
197
116
|
|
|
198
|
-
|
|
199
|
-
|
|
200
|
-
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
|
|
204
|
-
|
|
117
|
+
Output:
|
|
118
|
+
```
|
|
119
|
+
🔧 Codebaxing - Index Codebase
|
|
120
|
+
|
|
121
|
+
📁 Path: /path/to/project
|
|
122
|
+
|
|
123
|
+
================================================================================
|
|
124
|
+
INDEXING CODEBASE
|
|
125
|
+
================================================================================
|
|
126
|
+
Found 47 files
|
|
127
|
+
Parsed 645 symbols from 47 files
|
|
128
|
+
Generating embeddings for 645 chunks...
|
|
129
|
+
Model loaded: Xenova/all-MiniLM-L6-v2 (384 dims, CPU)
|
|
130
|
+
|
|
131
|
+
================================================================================
|
|
132
|
+
INDEXING COMPLETE
|
|
133
|
+
================================================================================
|
|
134
|
+
Files parsed: 47
|
|
135
|
+
Symbols extracted: 645
|
|
136
|
+
Chunks created: 645
|
|
137
|
+
Time elapsed: 21.9s
|
|
205
138
|
```
|
|
206
139
|
|
|
207
|
-
|
|
208
|
-
|
|
209
|
-
You can also use Codebaxing directly from terminal without AI agents:
|
|
140
|
+
#### Step 2: Search for code
|
|
210
141
|
|
|
211
142
|
```bash
|
|
212
|
-
# Index a codebase
|
|
213
|
-
npx codebaxing index /path/to/project
|
|
214
|
-
|
|
215
|
-
# Search for code
|
|
216
143
|
npx codebaxing search "authentication middleware"
|
|
217
144
|
npx codebaxing search "database connection" --path ./src --limit 10
|
|
218
|
-
|
|
219
|
-
# Show index statistics
|
|
220
|
-
npx codebaxing stats /path/to/project
|
|
221
145
|
```
|
|
222
146
|
|
|
223
|
-
|
|
224
|
-
```bash
|
|
225
|
-
docker run -d -p 8000:8000 chromadb/chroma
|
|
226
|
-
export CHROMADB_URL=http://localhost:8000
|
|
147
|
+
Output:
|
|
227
148
|
```
|
|
149
|
+
🔧 Codebaxing - Search
|
|
228
150
|
|
|
229
|
-
|
|
151
|
+
📁 Path: /path/to/project
|
|
152
|
+
🔍 Query: "authentication middleware"
|
|
153
|
+
📊 Limit: 5
|
|
230
154
|
|
|
231
|
-
|
|
155
|
+
────────────────────────────────────────────────────────────
|
|
156
|
+
Results:
|
|
232
157
|
|
|
233
|
-
|
|
158
|
+
1. src/middleware/auth.ts:15 - authMiddleware()
|
|
159
|
+
2. src/services/auth.ts:42 - validateToken()
|
|
160
|
+
3. src/routes/login.ts:8 - loginHandler()
|
|
234
161
|
|
|
235
|
-
|
|
162
|
+
────────────────────────────────────────────────────────────
|
|
163
|
+
```
|
|
236
164
|
|
|
237
|
-
|
|
165
|
+
#### Step 3: Check statistics
|
|
238
166
|
|
|
239
167
|
```bash
|
|
240
|
-
|
|
241
|
-
docker run -d -p 8000:8000 chromadb/chroma
|
|
242
|
-
|
|
243
|
-
# Set environment variable
|
|
244
|
-
export CHROMADB_URL=http://localhost:8000
|
|
168
|
+
npx codebaxing stats /path/to/project
|
|
245
169
|
```
|
|
246
170
|
|
|
247
|
-
|
|
171
|
+
#### CLI Commands Reference
|
|
248
172
|
|
|
249
|
-
|
|
173
|
+
| Command | Description |
|
|
174
|
+
|---------|-------------|
|
|
175
|
+
| `npx codebaxing install [--editor]` | Install MCP server to AI editor |
|
|
176
|
+
| `npx codebaxing uninstall [--editor]` | Uninstall MCP server |
|
|
177
|
+
| `npx codebaxing index <path>` | Index a codebase |
|
|
178
|
+
| `npx codebaxing search <query> [options]` | Search indexed codebase |
|
|
179
|
+
| `npx codebaxing stats [path]` | Show index statistics |
|
|
180
|
+
| `npx codebaxing --help` | Show help |
|
|
250
181
|
|
|
251
|
-
|
|
182
|
+
**Search options:**
|
|
183
|
+
- `--path, -p <path>` - Codebase path (default: current directory)
|
|
184
|
+
- `--limit, -n <number>` - Number of results (default: 5)
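When scripting searches, it can help to assemble the command line in one place. The following is a minimal sketch that uses only the `search` command and the `--path` and `--limit` options listed above. The helper name `build_search_command` is hypothetical, and the defaults simply mirror the documented ones.

```python
import shlex


def build_search_command(query: str, path: str = ".", limit: int = 5) -> list[str]:
    """Return the argv list for `npx codebaxing search`.

    Hypothetical helper; the flags are the ones documented in this README.
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return ["npx", "codebaxing", "search", query, "--path", path, "--limit", str(limit)]


argv = build_search_command("database connection", path="./src", limit=10)
# Show the shell-quoted form of the command without running it.
print(shlex.join(argv))
```

To actually run the search, the list can be passed to `subprocess.run(argv, check=True)`. Passing a list rather than a single string avoids shell quoting problems in the query.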
|
|
252
185
|
|
|
253
|
-
|
|
254
|
-
**Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
|
|
186
|
+
## Installation
|
|
255
187
|
|
|
256
|
-
|
|
188
|
+
### Option 1: Quick Install (Recommended)
|
|
257
189
|
|
|
258
|
-
```
|
|
259
|
-
|
|
260
|
-
|
|
261
|
-
|
|
262
|
-
|
|
263
|
-
|
|
264
|
-
}
|
|
265
|
-
}
|
|
266
|
-
}
|
|
190
|
+
```bash
|
|
191
|
+
npx codebaxing install # Claude Desktop (default)
|
|
192
|
+
npx codebaxing install --cursor # Cursor
|
|
193
|
+
npx codebaxing install --windsurf # Windsurf (Codeium)
|
|
194
|
+
npx codebaxing install --zed # Zed
|
|
195
|
+
npx codebaxing install --all # All supported editors
|
|
267
196
|
```
|
|
268
197
|
|
|
269
|
-
|
|
198
|
+
Then restart your editor.
|
|
270
199
|
|
|
271
|
-
|
|
272
|
-
npm install -g codebaxing
|
|
273
|
-
```
|
|
200
|
+
### Option 2: Manual Configuration
|
|
274
201
|
|
|
275
|
-
|
|
276
|
-
{
|
|
277
|
-
"mcpServers": {
|
|
278
|
-
"codebaxing": {
|
|
279
|
-
"command": "codebaxing"
|
|
280
|
-
}
|
|
281
|
-
}
|
|
282
|
-
}
|
|
283
|
-
```
|
|
202
|
+
#### Claude Desktop
|
|
284
203
|
|
|
285
|
-
|
|
204
|
+
Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
|
|
286
205
|
|
|
287
206
|
```json
|
|
288
207
|
{
|
|
289
208
|
"mcpServers": {
|
|
290
209
|
"codebaxing": {
|
|
291
210
|
"command": "npx",
|
|
292
|
-
"args": ["-y", "codebaxing"]
|
|
293
|
-
"env": {
|
|
294
|
-
"CHROMADB_URL": "http://localhost:8000"
|
|
295
|
-
}
|
|
211
|
+
"args": ["-y", "codebaxing"]
|
|
296
212
|
}
|
|
297
213
|
}
|
|
298
214
|
}
|
|
299
215
|
```
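If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that, assuming the file is plain JSON with a top-level `mcpServers` object as in the snippet above. The function name `add_codebaxing` is hypothetical, and the demo writes to a temporary file rather than a real editor config.

```python
import json
import tempfile
from pathlib import Path

# Entry taken from the manual configuration shown above.
CODEBAXING_ENTRY = {"command": "npx", "args": ["-y", "codebaxing"]}


def add_codebaxing(config_path: Path) -> dict:
    """Add the codebaxing server to an MCP config file, keeping other entries."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" object if it is missing, then set our entry.
    config.setdefault("mcpServers", {})["codebaxing"] = dict(CODEBAXING_ENTRY)
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another server.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "claude_desktop_config.json"
    demo.write_text('{"mcpServers": {"other": {"command": "node"}}}', encoding="utf-8")
    merged = add_codebaxing(demo)
    print(sorted(merged["mcpServers"]))
```

Back up the real config file before pointing a script like this at it.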
|
|
300
216
|
|
|
301
|
-
####
|
|
217
|
+
#### Cursor
|
|
218
|
+
|
|
219
|
+
Add to `~/.cursor/mcp.json`:
|
|
302
220
|
|
|
303
221
|
```json
|
|
304
222
|
{
|
|
305
223
|
"mcpServers": {
|
|
306
224
|
"codebaxing": {
|
|
307
|
-
"command": "
|
|
308
|
-
"args": ["
|
|
225
|
+
"command": "npx",
|
|
226
|
+
"args": ["-y", "codebaxing"]
|
|
309
227
|
}
|
|
310
228
|
}
|
|
311
229
|
}
|
|
312
230
|
```
|
|
313
231
|
|
|
314
|
-
|
|
315
|
-
|
|
316
|
-
The Codebaxing tools will now be available in Claude.
|
|
317
|
-
|
|
318
|
-
### Other AI Agents Integration
|
|
232
|
+
#### Windsurf
|
|
319
233
|
|
|
320
|
-
|
|
321
|
-
|
|
322
|
-
Add to Cursor settings (`~/.cursor/mcp.json`):
|
|
234
|
+
Add to `~/.codeium/windsurf/mcp_config.json`:
|
|
323
235
|
|
|
324
236
|
```json
|
|
325
237
|
{
|
|
@@ -332,16 +244,18 @@ Add to Cursor settings (`~/.cursor/mcp.json`):
|
|
|
332
244
|
}
|
|
333
245
|
```
|
|
334
246
|
|
|
335
|
-
####
|
|
247
|
+
#### Zed
|
|
336
248
|
|
|
337
|
-
Add to
|
|
249
|
+
Add to `~/.config/zed/settings.json`:
|
|
338
250
|
|
|
339
251
|
```json
|
|
340
252
|
{
|
|
341
|
-
"
|
|
253
|
+
"context_servers": {
|
|
342
254
|
"codebaxing": {
|
|
343
|
-
"command":
|
|
344
|
-
|
|
255
|
+
"command": {
|
|
256
|
+
"path": "npx",
|
|
257
|
+
"args": ["-y", "codebaxing"]
|
|
258
|
+
}
|
|
345
259
|
}
|
|
346
260
|
}
|
|
347
261
|
}
|
|
@@ -349,7 +263,7 @@ Add to Windsurf MCP config (`~/.codeium/windsurf/mcp_config.json`):
|
|
|
349
263
|
|
|
350
264
|
#### VS Code + Continue
|
|
351
265
|
|
|
352
|
-
Add to
|
|
266
|
+
Add to `~/.continue/config.json`:
|
|
353
267
|
|
|
354
268
|
```json
|
|
355
269
|
{
|
|
@@ -367,79 +281,67 @@ Add to Continue config (`~/.continue/config.json`):
|
|
|
367
281
|
}
|
|
368
282
|
```
|
|
369
283
|
|
|
370
|
-
|
|
371
|
-
|
|
372
|
-
Add to Zed settings (`~/.config/zed/settings.json`):
|
|
284
|
+
### Uninstall
|
|
373
285
|
|
|
374
|
-
```
|
|
375
|
-
|
|
376
|
-
|
|
377
|
-
"codebaxing": {
|
|
378
|
-
"command": {
|
|
379
|
-
"path": "npx",
|
|
380
|
-
"args": ["-y", "codebaxing"]
|
|
381
|
-
}
|
|
382
|
-
}
|
|
383
|
-
}
|
|
384
|
-
}
|
|
286
|
+
```bash
|
|
287
|
+
npx codebaxing uninstall # Claude Desktop
|
|
288
|
+
npx codebaxing uninstall --all # All editors
|
|
385
289
|
```
|
|
386
290
|
|
|
387
|
-
|
|
388
|
-
|
|
389
|
-
For any MCP-compatible client, use stdio transport:
|
|
291
|
+
## How It Works
|
|
390
292
|
|
|
391
|
-
|
|
392
|
-
# Command
|
|
393
|
-
npx -y codebaxing
|
|
293
|
+
### Architecture
|
|
394
294
|
|
|
395
|
-
# Or if installed globally
|
|
396
|
-
codebaxing
|
|
397
295
|
```
|
|
296
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
297
|
+
│ INDEXING │
|
|
298
|
+
├─────────────────────────────────────────────────────────────────┤
|
|
299
|
+
│ Source Files (.py, .ts, .js, .go, .rs, ...) │
|
|
300
|
+
│ │ │
|
|
301
|
+
│ ▼ │
|
|
302
|
+
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
|
|
303
|
+
│ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
|
|
304
|
+
│ │ Parser │ │ Extraction │ │ Model │ │
|
|
305
|
+
│ └──────────────┘ └──────────────┘ └──────────────┘ │
|
|
306
|
+
│ │ │ │ │
|
|
307
|
+
│ Parse AST Functions, Text → Vector │
|
|
308
|
+
│ Classes, etc. (384 dimensions) │
|
|
309
|
+
│ │ │
|
|
310
|
+
│ ▼ │
|
|
311
|
+
│ ┌──────────────┐ │
|
|
312
|
+
│ │ ChromaDB │ │
|
|
313
|
+
│ │ (vectors) │ │
|
|
314
|
+
│ └──────────────┘ │
|
|
315
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
398
316
|
|
|
399
|
-
|
|
317
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
318
|
+
│ SEARCH │
|
|
319
|
+
├─────────────────────────────────────────────────────────────────┤
|
|
320
|
+
│ "find auth code" │
|
|
321
|
+
│ │ │
|
|
322
|
+
│ ▼ │
|
|
323
|
+
│ ┌──────────────┐ ┌──────────────┐ │
|
|
324
|
+
│ │ Embedding │────────▶│ ChromaDB │ │
|
|
325
|
+
│ │ Model │ query │ Query │ │
|
|
326
|
+
│ └──────────────┘ vector └──────────────┘ │
|
|
327
|
+
│ │ │
|
|
328
|
+
│ ▼ │
|
|
329
|
+
│ Cosine Similarity │
|
|
330
|
+
│ │ │
|
|
331
|
+
│ ▼ │
|
|
332
|
+
│ Top-k Results │
|
|
333
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
334
|
+
```
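The search half of the diagram comes down to ranking stored vectors by cosine similarity to the query vector. The sketch below illustrates that step with toy 3-dimensional vectors in place of real 384-dimensional embeddings. In Codebaxing the comparison is delegated to ChromaDB, so the function names and vector values here are illustrative assumptions only.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2):
    """Return the k (score, name) pairs most similar to the query, best first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    return sorted(scored, reverse=True)[:k]


# Made-up vectors: the two auth-related symbols point in a similar direction.
index = {
    "login()": [0.9, 0.1, 0.0],
    "connect_db()": [0.0, 0.2, 0.9],
    "authMiddleware()": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stand-in for the embedding of "find auth code"

results = top_k(query, index, k=2)
for score, name in results:
    print(f"{name}: {score:.2f}")
```

With these values, the two auth-related symbols rank above `connect_db()`, even though no text matching is involved.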
|
|
400
335
|
|
|
401
|
-
###
|
|
402
|
-
|
|
403
|
-
| Tool | Description |
|
|
404
|
-
|------|-------------|
|
|
405
|
-
| `index` | Index a codebase. Modes: `auto` (incremental), `full`, `load-only` |
|
|
406
|
-
| `search` | Semantic search. Returns ranked code chunks |
|
|
407
|
-
| `stats` | Index statistics (files, symbols, chunks) |
|
|
408
|
-
| `languages` | List supported file extensions |
|
|
409
|
-
| `remember` | Store memories (conversation, status, decision, preference, doc, note) |
|
|
410
|
-
| `recall` | Semantic search over memories |
|
|
411
|
-
| `forget` | Delete memories by ID, type, tags, or age |
|
|
412
|
-
| `memory-stats` | Memory statistics by type |
|
|
413
|
-
|
|
414
|
-
### Example Workflow
|
|
415
|
-
|
|
416
|
-
1. **Index your codebase:**
|
|
417
|
-
```
|
|
418
|
-
index(path="/path/to/your/project")
|
|
419
|
-
```
|
|
420
|
-
|
|
421
|
-
2. **Search for code:**
|
|
422
|
-
```
|
|
423
|
-
search(question="authentication middleware")
|
|
424
|
-
search(question="database connection", language="typescript")
|
|
425
|
-
search(question="error handling", symbol_type="function")
|
|
426
|
-
```
|
|
427
|
-
|
|
428
|
-
3. **Store context:**
|
|
429
|
-
```
|
|
430
|
-
remember(content="Using PostgreSQL with Prisma ORM", memory_type="decision")
|
|
431
|
-
remember(content="Auth uses JWT tokens", memory_type="doc", tags=["auth", "security"])
|
|
432
|
-
```
|
|
433
|
-
|
|
434
|
-
4. **Recall context:**
|
|
435
|
-
```
|
|
436
|
-
recall(query="database setup")
|
|
437
|
-
recall(query="authentication", memory_type="decision")
|
|
438
|
-
```
|
|
336
|
+
### Why Semantic Search Works
|
|
439
337
|
|
|
440
|
-
|
|
338
|
+
The embedding model maps related terms to nearby vectors, so a query can match code that never contains the query word:
|
|
441
339
|
|
|
442
|
-
|
|
340
|
+
| Query | Finds (even without exact match) |
|
|
341
|
+
|-------|----------------------------------|
|
|
342
|
+
| "authentication" | login, credentials, auth, signin, validateUser |
|
|
343
|
+
| "database" | query, SQL, connection, ORM, repository |
|
|
344
|
+
| "error handling" | try/catch, exception, throw, ErrorBoundary |
|
|
443
345
|
|
|
444
346
|
## Configuration
|
|
445
347
|
|
|
@@ -448,62 +350,65 @@ Python, JavaScript, TypeScript, C, C++, Bash, Go, Java, Kotlin, Rust, Ruby, C#,
|
|
|
448
350
|
| Variable | Description | Default |
|
|
449
351
|
|----------|-------------|---------|
|
|
450
352
|
| `CHROMADB_URL` | ChromaDB server URL for persistent storage | (in-memory) |
|
|
451
|
-
| `CODEBAXING_DEVICE` | Compute device
|
|
353
|
+
| `CODEBAXING_DEVICE` | Compute device: `cpu`, `webgpu`, `cuda`, `auto` | `cpu` |
|
|
452
354
|
|
|
453
|
-
###
|
|
355
|
+
### Persistent Storage
|
|
356
|
+
|
|
357
|
+
By default, the index is stored in memory and lost when the server restarts.
|
|
454
358
|
|
|
455
|
-
|
|
359
|
+
For persistent storage:
|
|
456
360
|
|
|
457
361
|
```bash
|
|
458
|
-
#
|
|
459
|
-
|
|
460
|
-
|
|
461
|
-
# Auto-detect best device
|
|
462
|
-
export CODEBAXING_DEVICE=auto
|
|
362
|
+
# Start ChromaDB
|
|
363
|
+
docker run -d -p 8000:8000 chromadb/chroma
|
|
463
364
|
|
|
464
|
-
#
|
|
465
|
-
export
|
|
365
|
+
# Set environment variable
|
|
366
|
+
export CHROMADB_URL=http://localhost:8000
|
|
466
367
|
```
|
|
467
368
|
|
|
468
|
-
|
|
469
|
-
|
|
470
|
-
**Note:** macOS does not support CUDA (no NVIDIA drivers). Use `webgpu` for GPU acceleration on Mac.
|
|
369
|
+
Or in MCP config:
|
|
471
370
|
|
|
472
|
-
|
|
473
|
-
|
|
474
|
-
|
|
475
|
-
|
|
371
|
+
```json
|
|
372
|
+
{
|
|
373
|
+
"mcpServers": {
|
|
374
|
+
"codebaxing": {
|
|
375
|
+
"command": "npx",
|
|
376
|
+
"args": ["-y", "codebaxing"],
|
|
377
|
+
"env": {
|
|
378
|
+
"CHROMADB_URL": "http://localhost:8000"
|
|
379
|
+
}
|
|
380
|
+
}
|
|
381
|
+
}
|
|
382
|
+
}
|
|
383
|
+
```
|
|
476
384
|
|
|
477
|
-
|
|
385
|
+
### GPU Acceleration
|
|
478
386
|
|
|
479
387
|
```bash
|
|
480
|
-
|
|
481
|
-
|
|
482
|
-
|
|
483
|
-
npm test # Run tests
|
|
484
|
-
npm run typecheck # Type check without emitting
|
|
388
|
+
export CODEBAXING_DEVICE=webgpu # macOS (Metal)
|
|
389
|
+
export CODEBAXING_DEVICE=cuda # Linux/Windows (NVIDIA)
|
|
390
|
+
export CODEBAXING_DEVICE=auto # Auto-detect
|
|
485
391
|
```
|
|
486
392
|
|
|
487
|
-
|
|
393
|
+
**Note:** macOS does not support CUDA. Use `webgpu` for GPU acceleration on Mac.
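The two environment variables and their defaults can be checked before launching the server. The sketch below reads them the way the table above describes: `CHROMADB_URL` unset means in-memory storage, and `CODEBAXING_DEVICE` defaults to `cpu`. The function `read_settings` and its validation are assumptions for illustration; the server's own handling may differ.

```python
import os

# Values listed for CODEBAXING_DEVICE in the configuration table.
VALID_DEVICES = {"cpu", "webgpu", "cuda", "auto"}


def read_settings(env=None) -> dict:
    """Resolve Codebaxing settings from environment variables.

    Hypothetical pre-flight check; it does not call Codebaxing itself.
    """
    env = os.environ if env is None else env
    device = env.get("CODEBAXING_DEVICE", "cpu")
    if device not in VALID_DEVICES:
        raise ValueError(f"unsupported CODEBAXING_DEVICE: {device!r}")
    url = env.get("CHROMADB_URL") or None  # treat an empty string as unset
    return {"device": device, "chromadb_url": url, "persistent": url is not None}


print(read_settings({}))
print(read_settings({"CODEBAXING_DEVICE": "webgpu", "CHROMADB_URL": "http://localhost:8000"}))
```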
|
|
488
394
|
|
|
489
|
-
|
|
490
|
-
# Run unit tests
|
|
491
|
-
npm test
|
|
395
|
+
## Supported Languages
|
|
492
396
|
|
|
493
|
-
|
|
494
|
-
|
|
495
|
-
|
|
397
|
+
Python, JavaScript, TypeScript, C, C++, Bash, Go, Java, Kotlin, Rust, Ruby, C#, PHP, Scala, Swift, Lua, Dart, Elixir, Haskell, OCaml, Zig, Perl, CSS, HTML, Vue, JSON, YAML, TOML, Makefile
|
|
398
|
+
|
|
399
|
+
## Features
|
|
496
400
|
|
|
497
|
-
|
|
401
|
+
- **Semantic Code Search**: Find code by describing what you're looking for
|
|
402
|
+
- **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
|
|
403
|
+
- **Memory Layer**: Store and recall project context across sessions
|
|
404
|
+
- **Incremental Indexing**: Only re-index changed files
|
|
405
|
+
- **100% Local**: No API calls, no cloud, works offline
|
|
406
|
+
- **GPU Acceleration**: Optional WebGPU/CUDA support
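Incremental indexing generally works by remembering a fingerprint of each file and re-processing only the files whose fingerprint changed. The sketch below shows that idea with SHA-256 content hashes. This README does not describe how Codebaxing detects changes, so the hashing scheme and the names `file_digest` and `changed_files` are assumptions, not the package's actual mechanism.

```python
import hashlib


def file_digest(content: bytes) -> str:
    """Fingerprint file content with SHA-256."""
    return hashlib.sha256(content).hexdigest()


def changed_files(current: dict[str, bytes], previous: dict[str, str]) -> list[str]:
    """Return paths that are new or whose content hash differs from the stored one."""
    return sorted(
        path for path, data in current.items() if previous.get(path) != file_digest(data)
    )


# Hashes recorded during an earlier indexing run.
previous = {
    "auth.py": file_digest(b"def login(): ..."),
    "db.py": file_digest(b"def connect(): ..."),
}
# File contents as they are now.
current = {
    "auth.py": b"def login(): ...",           # unchanged
    "db.py": b"def connect(timeout=5): ...",  # edited
    "api.py": b"def handler(): ...",          # new file
}

to_reindex = changed_files(current, previous)
print(to_reindex)
```

Only the edited and new files are returned, so the unchanged `auth.py` would be skipped on the next run.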
|
|
498
407
|
|
|
499
|
-
|
|
500
|
-
|
|
501
|
-
|
|
502
|
-
|
|
503
|
-
| Understands context | No | Yes |
|
|
504
|
-
| Finds synonyms | No | Yes |
|
|
505
|
-
| Speed | Very fast | Fast (after indexing) |
|
|
506
|
-
| Setup | None | Requires indexing |
|
|
408
|
+
## Requirements
|
|
409
|
+
|
|
410
|
+
- Node.js >= 20.0.0
|
|
411
|
+
- ~500MB disk space for embedding model (downloaded on first run)
|
|
507
412
|
|
|
508
413
|
## Technical Details
|
|
509
414
|
|
|
@@ -515,6 +420,15 @@ CHROMADB_URL=http://localhost:8000 npx tsx test-indexing.ts
|
|
|
515
420
|
| Code Parser | Tree-sitter |
|
|
516
421
|
| MCP SDK | `@modelcontextprotocol/sdk` |
|
|
517
422
|
|
|
423
|
+
## Development
|
|
424
|
+
|
|
425
|
+
```bash
|
|
426
|
+
npm install # Install dependencies
|
|
427
|
+
npm run dev # Run with tsx (no build needed)
|
|
428
|
+
npm run build # Compile TypeScript
|
|
429
|
+
npm test # Run tests
|
|
430
|
+
```
|
|
431
|
+
|
|
518
432
|
## License
|
|
519
433
|
|
|
520
434
|
MIT
|