codebaxing 0.2.1 → 0.2.2

Files changed (3)
  1. package/README.md +237 -323
  2. package/README.vi.md +238 -324
  3. package/package.json +1 -1
package/README.md CHANGED
@@ -3,15 +3,28 @@
3
3
  [![npm version](https://img.shields.io/npm/v/codebaxing.svg)](https://www.npmjs.com/package/codebaxing)
4
4
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
5
5
 
6
+ **[English](README.md)** | [Tiếng Việt](README.vi.md)
7
+
6
8
  MCP server for **semantic code search**. Index your codebase and search using natural language queries.
7
9
 
10
+ ## Table of Contents
11
+
12
+ - [The Idea](#the-idea)
13
+ - [Quick Start](#quick-start)
14
+ - [Usage](#usage)
15
+ - [Via AI Agents (MCP)](#via-ai-agents-mcp)
16
+ - [Via CLI (Terminal)](#via-cli-terminal)
17
+ - [Installation](#installation)
18
+ - [How It Works](#how-it-works)
19
+ - [Configuration](#configuration)
20
+ - [Supported Languages](#supported-languages)
21
+
8
22
  ## The Idea
9
23
 
10
24
  Traditional code search (grep, ripgrep) matches exact text. But developers think in concepts:
11
25
 
12
26
  - *"Where is the authentication logic?"* - not `grep "authentication"`
13
27
  - *"Find database connection code"* - not `grep "database"`
14
- - *"How does error handling work?"* - not `grep "error"`
15
28
 
16
29
  **Codebaxing** bridges this gap using **semantic search**:
17
30
 
@@ -22,304 +35,203 @@ Finds: login(), validateCredentials(), checkPassword(), authMiddleware()
22
35
  (even if they don't contain the word "authentication")
23
36
  ```
24
37
 
25
- ## How It Works
38
+ ## Quick Start
26
39
 
27
- ### Architecture Overview
40
+ ### 1. Install to your AI editor
28
41
 
29
- ```
30
- ┌─────────────────────────────────────────────────────────────────┐
31
- │ INDEXING │
32
- ├─────────────────────────────────────────────────────────────────┤
33
- │ │
34
- │ Source Files (.py, .ts, .js, .go, .rs, ...) │
35
- │ │ │
36
- │ ▼ │
37
- │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
38
- │ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
39
- │ │ Parser │ │ Extraction │ │ Model │ │
40
- │ └──────────────┘ └──────────────┘ └──────────────┘ │
41
- │ │ │ │ │
42
- │ Parse AST Functions, Text → Vector │
43
- │ Classes, etc. (384 dimensions) │
44
- │ │ │
45
- │ ▼ │
46
- │ ┌──────────────┐ │
47
- │ │ ChromaDB │ │
48
- │ │ (vectors) │ │
49
- │ └──────────────┘ │
50
- │ │
51
- └─────────────────────────────────────────────────────────────────┘
52
-
53
- ┌─────────────────────────────────────────────────────────────────┐
54
- │ SEARCH │
55
- ├─────────────────────────────────────────────────────────────────┤
56
- │ │
57
- │ "find auth code" │
58
- │ │ │
59
- │ ▼ │
60
- │ ┌──────────────┐ ┌──────────────┐ │
61
- │ │ Embedding │────────▶│ ChromaDB │ │
62
- │ │ Model │ query │ Query │ │
63
- │ └──────────────┘ vector └──────────────┘ │
64
- │ │ │
65
- │ ▼ │
66
- │ Cosine Similarity │
67
- │ │ │
68
- │ ▼ │
69
- │ Top-k Results │
70
- │ (login.py, auth.ts, ...) │
71
- │ │
72
- └─────────────────────────────────────────────────────────────────┘
42
+ ```bash
43
+ npx codebaxing install # Claude Desktop
44
+ npx codebaxing install --cursor # Cursor
45
+ npx codebaxing install --windsurf # Windsurf
46
+ npx codebaxing install --all # All editors
73
47
  ```
74
48
 
75
- ### Step-by-Step Process
49
+ ### 2. Restart your editor
76
50
 
77
- #### 1. Parsing (Tree-sitter)
51
+ ### 3. Start using
78
52
 
79
- Tree-sitter parses source code into an Abstract Syntax Tree (AST), extracting meaningful symbols:
53
+ In Claude Desktop (or Cursor, Windsurf...):
80
54
 
81
- ```python
82
- # Input: auth.py
83
- def login(username, password):
84
- """Authenticate user credentials"""
85
- if validate(username, password):
86
- return create_session(username)
87
- raise AuthError("Invalid credentials")
88
55
  ```
56
+ You: Index my project at /path/to/myproject
57
+ Claude: [calls index tool]
89
58
 
59
+ You: Find the authentication logic
60
+ Claude: [calls search tool, returns relevant code]
90
61
  ```
91
- # Output: Symbol
92
- {
93
- name: "login",
94
- type: "function",
95
- signature: "def login(username, password)",
96
- code: "def login(username, password):...",
97
- filepath: "auth.py",
98
- lineStart: 1,
99
- lineEnd: 6
100
- }
101
- ```
102
62
 
103
- #### 2. Embedding (all-MiniLM-L6-v2)
63
+ ## Usage
64
+
65
+ ### Via AI Agents (MCP)
66
+
67
+ After installing to your AI editor, you interact through natural conversation:
104
68
 
105
- Each code chunk is converted to a 384-dimensional vector using a neural network:
69
+ #### Step 1: Index your codebase (Required first)
106
70
 
107
71
  ```
108
- "def login(username, password): authenticate user..."
109
-
110
- Embedding Model (runs locally, ONNX)
111
-
112
- [0.12, -0.34, 0.56, 0.08, ..., -0.22] (384 numbers)
72
+ You: Index the codebase at /Users/me/projects/myapp
113
73
  ```
114
74
 
115
- The model understands semantic relationships:
116
- - `"authentication"` ≈ `"login"` ≈ `"credentials"` (vectors are close)
117
- - `"database"` ≈ `"query"` ≈ `"SQL"` (vectors are close)
118
- - `"authentication"` ≠ `"database"` (vectors are far apart)
75
+ Claude will call `index(path="/Users/me/projects/myapp")` and show progress.
119
76
 
120
- #### 3. Storage (ChromaDB)
77
+ > **Note:** The first run downloads the embedding model (~90MB) and takes 1-2 minutes.
121
78
 
122
- Vectors are stored in ChromaDB, a vector database optimized for similarity search:
79
+ #### Step 2: Search for code
123
80
 
124
81
  ```
125
- ChromaDB Collection:
126
- ┌─────────────────────────────────────────────────────┐
127
- ID │ Vector (384d) │ Metadata │
128
- ├─────────────────────────────────────────────────────┤
129
- │ chunk_001 │ [0.12, -0.34, ...] │ {file: auth.py} │
130
- │ chunk_002 │ [0.45, 0.23, ...] │ {file: db.py} │
131
- │ chunk_003 │ [-0.11, 0.67, ...] │ {file: api.ts} │
132
- │ ... │ ... │ ... │
133
- └─────────────────────────────────────────────────────┘
82
+ You: Find code that handles user authentication
83
+ You: Where is the database connection logic?
84
+ You: Show me error handling patterns
134
85
  ```
135
86
 
136
- #### 4. Search (Cosine Similarity)
137
-
138
- When you search, your query is embedded and compared against all stored vectors:
87
+ #### Step 3: Use memory (optional)
139
88
 
140
89
  ```
141
- Query: "user authentication"
142
-
143
- Query Vector: [0.15, -0.31, 0.52, ...]
144
-
145
- Compare with all vectors using cosine similarity:
146
- - chunk_001 (login): similarity = 0.89 ← HIGH
147
- - chunk_002 (db): similarity = 0.23 ← LOW
148
- - chunk_003 (auth): similarity = 0.85 ← HIGH
149
-
150
- Return top-k most similar chunks
90
+ You: Remember that we're using PostgreSQL with Prisma ORM
91
+ You: What decisions have we made about the database?
151
92
  ```
152
93
 
153
- ### Why Semantic Search Works
94
+ #### MCP Tools Reference
154
95
 
155
- The embedding model was trained on millions of text pairs, learning that:
96
+ | Tool | Description | Example |
97
+ |------|-------------|---------|
98
+ | `index` | Index a codebase (**required first**) | `index(path="/project")` |
99
+ | `search` | Semantic code search | `search(question="auth middleware")` |
100
+ | `stats` | Index statistics | `stats()` |
101
+ | `languages` | Supported extensions | `languages()` |
102
+ | `remember` | Store memory | `remember(content="Using Redis", memory_type="decision")` |
103
+ | `recall` | Retrieve memories | `recall(query="database")` |
104
+ | `forget` | Delete memories | `forget(memory_type="note")` |
105
+ | `memory-stats` | Memory statistics | `memory-stats()` |
156
106
 
157
- | Concept A | ≈ Similar To | Distance |
158
- |-----------|--------------|----------|
159
- | authentication | login, credentials, auth, signin | Close |
160
- | database | query, SQL, connection, ORM | Close |
161
- | error | exception, failure, catch, throw | Close |
162
- | parse | tokenize, lexer, AST, syntax | Close |
107
+ ### Via CLI (Terminal)
163
108
 
164
- This allows finding code by **meaning**, not just keywords.
109
+ You can use Codebaxing directly from the terminal, without AI agents:
165
110
 
166
- ## Features
167
-
168
- - **Semantic Code Search**: Find code by describing what you're looking for
169
- - **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
170
- - **Memory Layer**: Store and recall project context across sessions
171
- - **Incremental Indexing**: Only re-index changed files
172
- - **100% Local**: No API calls, no cloud, works offline
173
- - **GPU Acceleration**: Optional WebGPU/CUDA support
174
-
175
- ## Requirements
176
-
177
- - Node.js >= 20.0.0
178
- - ~500MB disk space for embedding model (downloaded on first run)
179
-
180
- ## Installation
181
-
182
- ### Quick Install (Recommended)
111
+ #### Step 1: Index your codebase (Required first)
183
112
 
184
113
  ```bash
185
- # Install to Claude Desktop
186
- npx codebaxing install
187
-
188
- # Install to Cursor
189
- npx codebaxing install --cursor
190
-
191
- # Install to Windsurf
192
- npx codebaxing install --windsurf
193
-
194
- # Install to all supported editors
195
- npx codebaxing install --all
114
+ npx codebaxing index /path/to/project
196
115
  ```
197
116
 
198
- Then restart your editor. Done!
199
-
200
- ### Uninstall
201
-
202
- ```bash
203
- npx codebaxing uninstall # Remove from Claude Desktop
204
- npx codebaxing uninstall --all # Remove from all editors
117
+ Output:
118
+ ```
119
+ 🔧 Codebaxing - Index Codebase
120
+
121
+ 📁 Path: /path/to/project
122
+
123
+ ================================================================================
124
+ INDEXING CODEBASE
125
+ ================================================================================
126
+ Found 47 files
127
+ Parsed 645 symbols from 47 files
128
+ Generating embeddings for 645 chunks...
129
+ Model loaded: Xenova/all-MiniLM-L6-v2 (384 dims, CPU)
130
+
131
+ ================================================================================
132
+ INDEXING COMPLETE
133
+ ================================================================================
134
+ Files parsed: 47
135
+ Symbols extracted: 645
136
+ Chunks created: 645
137
+ Time elapsed: 21.9s
205
138
  ```
206
139
 
207
- ### CLI Commands (Direct Usage)
208
-
209
- You can also use Codebaxing directly from terminal without AI agents:
140
+ #### Step 2: Search for code
210
141
 
211
142
  ```bash
212
- # Index a codebase
213
- npx codebaxing index /path/to/project
214
-
215
- # Search for code
216
143
  npx codebaxing search "authentication middleware"
217
144
  npx codebaxing search "database connection" --path ./src --limit 10
218
-
219
- # Show index statistics
220
- npx codebaxing stats /path/to/project
221
145
  ```
222
146
 
223
- **Note:** For persistent storage, run ChromaDB first:
224
- ```bash
225
- docker run -d -p 8000:8000 chromadb/chroma
226
- export CHROMADB_URL=http://localhost:8000
147
+ Output:
227
148
  ```
149
+ 🔧 Codebaxing - Search
228
150
 
229
- ### Manual Installation
151
+ 📁 Path: /path/to/project
152
+ 🔍 Query: "authentication middleware"
153
+ 📊 Limit: 5
230
154
 
231
- If you prefer manual configuration, see [Manual Configuration](#configure-claude-desktop) below.
155
+ ────────────────────────────────────────────────────────────
156
+ Results:
232
157
 
233
- ### (Optional) Persistent Storage
158
+ 1. src/middleware/auth.ts:15 - authMiddleware()
159
+ 2. src/services/auth.ts:42 - validateToken()
160
+ 3. src/routes/login.ts:8 - loginHandler()
234
161
 
235
- By default, the index is stored in memory and lost when the server restarts.
162
+ ────────────────────────────────────────────────────────────
163
+ ```
236
164
 
237
- For persistent storage, run ChromaDB:
165
+ #### Step 3: Check statistics
238
166
 
239
167
  ```bash
240
- # Using Docker (recommended)
241
- docker run -d -p 8000:8000 chromadb/chroma
242
-
243
- # Set environment variable
244
- export CHROMADB_URL=http://localhost:8000
168
+ npx codebaxing stats /path/to/project
245
169
  ```
246
170
 
247
- ### Manual Configuration
171
+ #### CLI Commands Reference
248
172
 
249
- #### Configure Claude Desktop
173
+ | Command | Description |
174
+ |---------|-------------|
175
+ | `npx codebaxing install [--editor]` | Install MCP server to AI editor |
176
+ | `npx codebaxing uninstall [--editor]` | Uninstall MCP server |
177
+ | `npx codebaxing index <path>` | Index a codebase |
178
+ | `npx codebaxing search <query> [options]` | Search indexed codebase |
179
+ | `npx codebaxing stats [path]` | Show index statistics |
180
+ | `npx codebaxing --help` | Show help |
250
181
 
251
- Add to your Claude Desktop config file:
182
+ **Search options:**
183
+ - `--path, -p <path>` - Codebase path (default: current directory)
184
+ - `--limit, -n <number>` - Number of results (default: 5)
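When the CLI is driven from a script rather than typed by hand, it can help to assemble the argument list in one place. The sketch below is a minimal illustration that uses only the `search` subcommand and the `--path` and `--limit` options listed above; the helper name `build_search_command` is hypothetical and is not part of Codebaxing.

```python
import shlex
from typing import List, Optional


def build_search_command(query: str, path: Optional[str] = None,
                         limit: Optional[int] = None) -> List[str]:
    """Build the argv for `npx codebaxing search` from the documented options."""
    if not query.strip():
        raise ValueError("query must not be empty")
    argv = ["npx", "codebaxing", "search", query]
    if path is not None:
        # When omitted, the CLI is documented to use the current directory.
        argv += ["--path", path]
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        # When omitted, the CLI is documented to return 5 results.
        argv += ["--limit", str(limit)]
    return argv


if __name__ == "__main__":
    # Print a shell-quoted line; pass the list itself to subprocess.run to execute it.
    print(shlex.join(build_search_command("database connection", path="./src", limit=10)))
```

Passing the list (not a joined string) to `subprocess.run` avoids shell-quoting problems with multi-word queries.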
252
185
 
253
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
254
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
186
+ ## Installation
255
187
 
256
- #### Via npx (no install needed):
188
+ ### Option 1: Quick Install (Recommended)
257
189
 
258
- ```json
259
- {
260
- "mcpServers": {
261
- "codebaxing": {
262
- "command": "npx",
263
- "args": ["-y", "codebaxing"]
264
- }
265
- }
266
- }
190
+ ```bash
191
+ npx codebaxing install # Claude Desktop (default)
192
+ npx codebaxing install --cursor # Cursor
193
+ npx codebaxing install --windsurf # Windsurf (Codeium)
194
+ npx codebaxing install --zed # Zed
195
+ npx codebaxing install --all # All supported editors
267
196
  ```
268
197
 
269
- #### Via global install:
198
+ Then restart your editor.
270
199
 
271
- ```bash
272
- npm install -g codebaxing
273
- ```
200
+ ### Option 2: Manual Configuration
274
201
 
275
- ```json
276
- {
277
- "mcpServers": {
278
- "codebaxing": {
279
- "command": "codebaxing"
280
- }
281
- }
282
- }
283
- ```
202
+ #### Claude Desktop
284
203
 
285
- #### With persistent storage (ChromaDB):
204
+ Add to `~/Library/Application Support/Claude/claude_desktop_config.json` (macOS) or `%APPDATA%\Claude\claude_desktop_config.json` (Windows):
286
205
 
287
206
  ```json
288
207
  {
289
208
  "mcpServers": {
290
209
  "codebaxing": {
291
210
  "command": "npx",
292
- "args": ["-y", "codebaxing"],
293
- "env": {
294
- "CHROMADB_URL": "http://localhost:8000"
295
- }
211
+ "args": ["-y", "codebaxing"]
296
212
  }
297
213
  }
298
214
  }
299
215
  ```
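For readers who would rather script the manual step, the edit above amounts to merging one entry into the `mcpServers` object while leaving any other servers untouched. The following is a minimal sketch under that assumption. The function name `add_codebaxing_server` is hypothetical, and nothing here is claimed to match what `npx codebaxing install` does internally.

```python
import json
from pathlib import Path


def add_codebaxing_server(config_path: Path) -> dict:
    """Merge the codebaxing entry into an MCP config file, keeping other servers."""
    # Start from the existing config if the file is present, else from an empty object.
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}

    # Same entry as in the JSON snippet above.
    servers = config.setdefault("mcpServers", {})
    servers["codebaxing"] = {"command": "npx", "args": ["-y", "codebaxing"]}

    # Create the parent directory if needed, then write the merged config back.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

The sketch only covers configs that use the `mcpServers` layout. Zed's `context_servers` layout is different and would need its own handling.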
300
216
 
301
- #### From source (development):
217
+ #### Cursor
218
+
219
+ Add to `~/.cursor/mcp.json`:
302
220
 
303
221
  ```json
304
222
  {
305
223
  "mcpServers": {
306
224
  "codebaxing": {
307
- "command": "node",
308
- "args": ["/path/to/codebaxing/dist/mcp/server.js"]
225
+ "command": "npx",
226
+ "args": ["-y", "codebaxing"]
309
227
  }
310
228
  }
311
229
  }
312
230
  ```
313
231
 
314
- ### Restart Claude Desktop
315
-
316
- The Codebaxing tools will now be available in Claude.
317
-
318
- ### Other AI Agents Integration
232
+ #### Windsurf
319
233
 
320
- #### Cursor
321
-
322
- Add to Cursor settings (`~/.cursor/mcp.json`):
234
+ Add to `~/.codeium/windsurf/mcp_config.json`:
323
235
 
324
236
  ```json
325
237
  {
@@ -332,16 +244,18 @@ Add to Cursor settings (`~/.cursor/mcp.json`):
332
244
  }
333
245
  ```
334
246
 
335
- #### Windsurf (Codeium)
247
+ #### Zed
336
248
 
337
- Add to Windsurf MCP config (`~/.codeium/windsurf/mcp_config.json`):
249
+ Add to `~/.config/zed/settings.json`:
338
250
 
339
251
  ```json
340
252
  {
341
- "mcpServers": {
253
+ "context_servers": {
342
254
  "codebaxing": {
343
- "command": "npx",
344
- "args": ["-y", "codebaxing"]
255
+ "command": {
256
+ "path": "npx",
257
+ "args": ["-y", "codebaxing"]
258
+ }
345
259
  }
346
260
  }
347
261
  }
@@ -349,7 +263,7 @@ Add to Windsurf MCP config (`~/.codeium/windsurf/mcp_config.json`):
349
263
 
350
264
  #### VS Code + Continue
351
265
 
352
- Add to Continue config (`~/.continue/config.json`):
266
+ Add to `~/.continue/config.json`:
353
267
 
354
268
  ```json
355
269
  {
@@ -367,79 +281,67 @@ Add to Continue config (`~/.continue/config.json`):
367
281
  }
368
282
  ```
369
283
 
370
- #### Zed
371
-
372
- Add to Zed settings (`~/.config/zed/settings.json`):
284
+ ### Uninstall
373
285
 
374
- ```json
375
- {
376
- "context_servers": {
377
- "codebaxing": {
378
- "command": {
379
- "path": "npx",
380
- "args": ["-y", "codebaxing"]
381
- }
382
- }
383
- }
384
- }
286
+ ```bash
287
+ npx codebaxing uninstall # Claude Desktop
288
+ npx codebaxing uninstall --all # All editors
385
289
  ```
386
290
 
387
- #### Generic MCP Client
388
-
389
- For any MCP-compatible client, use stdio transport:
291
+ ## How It Works
390
292
 
391
- ```bash
392
- # Command
393
- npx -y codebaxing
293
+ ### Architecture
394
294
 
395
- # Or if installed globally
396
- codebaxing
397
295
  ```
296
+ ┌─────────────────────────────────────────────────────────────────┐
297
+ │ INDEXING │
298
+ ├─────────────────────────────────────────────────────────────────┤
299
+ │ Source Files (.py, .ts, .js, .go, .rs, ...) │
300
+ │ │ │
301
+ │ ▼ │
302
+ │ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
303
+ │ │ Tree-sitter │───▶│ Symbols │───▶│ Embedding │ │
304
+ │ │ Parser │ │ Extraction │ │ Model │ │
305
+ │ └──────────────┘ └──────────────┘ └──────────────┘ │
306
+ │ │ │ │ │
307
+ │ Parse AST Functions, Text → Vector │
308
+ │ Classes, etc. (384 dimensions) │
309
+ │ │ │
310
+ │ ▼ │
311
+ │ ┌──────────────┐ │
312
+ │ │ ChromaDB │ │
313
+ │ │ (vectors) │ │
314
+ │ └──────────────┘ │
315
+ └─────────────────────────────────────────────────────────────────┘
398
316
 
399
- ## Usage
317
+ ┌─────────────────────────────────────────────────────────────────┐
318
+ │ SEARCH │
319
+ ├─────────────────────────────────────────────────────────────────┤
320
+ │ "find auth code" │
321
+ │ │ │
322
+ │ ▼ │
323
+ │ ┌──────────────┐ ┌──────────────┐ │
324
+ │ │ Embedding │────────▶│ ChromaDB │ │
325
+ │ │ Model │ query │ Query │ │
326
+ │ └──────────────┘ vector └──────────────┘ │
327
+ │ │ │
328
+ │ ▼ │
329
+ │ Cosine Similarity │
330
+ │ │ │
331
+ │ ▼ │
332
+ │ Top-k Results │
333
+ └─────────────────────────────────────────────────────────────────┘
334
+ ```
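The "Parser → Symbols Extraction" half of the indexing diagram can be illustrated with a small stand-in. Codebaxing uses Tree-sitter for this step. The sketch below swaps in Python's built-in `ast` module instead, so it handles Python source only. The record fields mirror the symbol example from the earlier README (`name`, `type`, `signature`, `code`, `filepath`, `lineStart`, `lineEnd`), and the function name `extract_symbols` is hypothetical.

```python
import ast
from typing import Dict, List


def extract_symbols(source: str, filepath: str) -> List[Dict[str, object]]:
    """Collect functions and classes from Python source as symbol records."""
    lines = source.splitlines()
    symbols: List[Dict[str, object]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            kind = "function"
        elif isinstance(node, ast.ClassDef):
            kind = "class"
        else:
            continue
        symbols.append({
            "name": node.name,
            "type": kind,
            # First line of the definition only; multi-line signatures are cut short.
            "signature": lines[node.lineno - 1].strip().rstrip(":"),
            # Full text of the definition; this is what would be embedded as a chunk.
            "code": "\n".join(lines[node.lineno - 1:node.end_lineno]),
            "filepath": filepath,
            "lineStart": node.lineno,
            "lineEnd": node.end_lineno,
        })
    return symbols
```

Each record's `code` field is the text that would then go to the embedding model, one vector per symbol.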
400
335
 
401
- ### MCP Tools
402
-
403
- | Tool | Description |
404
- |------|-------------|
405
- | `index` | Index a codebase. Modes: `auto` (incremental), `full`, `load-only` |
406
- | `search` | Semantic search. Returns ranked code chunks |
407
- | `stats` | Index statistics (files, symbols, chunks) |
408
- | `languages` | List supported file extensions |
409
- | `remember` | Store memories (conversation, status, decision, preference, doc, note) |
410
- | `recall` | Semantic search over memories |
411
- | `forget` | Delete memories by ID, type, tags, or age |
412
- | `memory-stats` | Memory statistics by type |
413
-
414
- ### Example Workflow
415
-
416
- 1. **Index your codebase:**
417
- ```
418
- index(path="/path/to/your/project")
419
- ```
420
-
421
- 2. **Search for code:**
422
- ```
423
- search(question="authentication middleware")
424
- search(question="database connection", language="typescript")
425
- search(question="error handling", symbol_type="function")
426
- ```
427
-
428
- 3. **Store context:**
429
- ```
430
- remember(content="Using PostgreSQL with Prisma ORM", memory_type="decision")
431
- remember(content="Auth uses JWT tokens", memory_type="doc", tags=["auth", "security"])
432
- ```
433
-
434
- 4. **Recall context:**
435
- ```
436
- recall(query="database setup")
437
- recall(query="authentication", memory_type="decision")
438
- ```
336
+ ### Why Semantic Search Works
439
337
 
440
- ## Supported Languages
338
+ The embedding model understands that:
441
339
 
442
- Python, JavaScript, TypeScript, C, C++, Bash, Go, Java, Kotlin, Rust, Ruby, C#, PHP, Scala, Swift, Lua, Dart, Elixir, Haskell, OCaml, Zig, Perl, CSS, HTML, Vue, JSON, YAML, TOML, Makefile
340
+ | Query | Finds (even without exact match) |
341
+ |-------|----------------------------------|
342
+ | "authentication" | login, credentials, auth, signin, validateUser |
343
+ | "database" | query, SQL, connection, ORM, repository |
344
+ | "error handling" | try/catch, exception, throw, ErrorBoundary |
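The ranking step behind this table is cosine similarity between the query vector and each stored chunk vector, followed by a top-k cut. Here is a minimal sketch in plain Python. The three-dimensional vectors are invented for illustration (real embeddings here have 384 dimensions), the names `cosine_similarity` and `top_k` are hypothetical, and in practice ChromaDB performs this query.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, Sequence[float]],
          k: int = 5) -> List[Tuple[str, float]]:
    """Rank stored chunk vectors by similarity to the query vector, best first."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors: the first two point in nearly the same direction as the query.
index = {
    "login()": [0.9, 0.1, 0.0],
    "authMiddleware()": [0.8, 0.3, 0.1],
    "connectDb()": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.2, 0.0], index, k=2)
```

With these toy values the two auth-related chunks rank above `connectDb()`, although none of the chunk names contains the query's wording. That is the effect the table describes.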
443
345
 
444
346
  ## Configuration
445
347
 
@@ -448,62 +350,65 @@ Python, JavaScript, TypeScript, C, C++, Bash, Go, Java, Kotlin, Rust, Ruby, C#,
448
350
  | Variable | Description | Default |
449
351
  |----------|-------------|---------|
450
352
  | `CHROMADB_URL` | ChromaDB server URL for persistent storage | (in-memory) |
451
- | `CODEBAXING_DEVICE` | Compute device for embeddings | `cpu` |
353
+ | `CODEBAXING_DEVICE` | Compute device: `cpu`, `webgpu`, `cuda`, `auto` | `cpu` |
452
354
 
453
- ### GPU Acceleration
355
+ ### Persistent Storage
356
+
357
+ By default, the index is stored in memory and lost when the server restarts.
454
358
 
455
- Enable GPU for faster embedding generation:
359
+ For persistent storage:
456
360
 
457
361
  ```bash
458
- # WebGPU (experimental, uses Metal on macOS)
459
- export CODEBAXING_DEVICE=webgpu
460
-
461
- # Auto-detect best device
462
- export CODEBAXING_DEVICE=auto
362
+ # Start ChromaDB
363
+ docker run -d -p 8000:8000 chromadb/chroma
463
364
 
464
- # NVIDIA GPU (Linux/Windows only, requires CUDA)
465
- export CODEBAXING_DEVICE=cuda
365
+ # Set environment variable
366
+ export CHROMADB_URL=http://localhost:8000
466
367
  ```
467
368
 
468
- Default is `cpu` which works everywhere.
469
-
470
- **Note:** macOS does not support CUDA (no NVIDIA drivers). Use `webgpu` for GPU acceleration on Mac.
369
+ Or in MCP config:
471
370
 
472
- ### Storage
473
-
474
- Metadata is stored in `.codebaxing/` folder within your project:
475
- - `metadata.json` - Index metadata and file timestamps
371
+ ```json
372
+ {
373
+ "mcpServers": {
374
+ "codebaxing": {
375
+ "command": "npx",
376
+ "args": ["-y", "codebaxing"],
377
+ "env": {
378
+ "CHROMADB_URL": "http://localhost:8000"
379
+ }
380
+ }
381
+ }
382
+ }
383
+ ```
476
384
 
477
- ## Development
385
+ ### GPU Acceleration
478
386
 
479
387
  ```bash
480
- npm run dev # Run with tsx (no build needed)
481
- npm run build # Compile TypeScript
482
- npm start # Run compiled version
483
- npm test # Run tests
484
- npm run typecheck # Type check without emitting
388
+ export CODEBAXING_DEVICE=webgpu # macOS (Metal)
389
+ export CODEBAXING_DEVICE=cuda # Linux/Windows (NVIDIA)
390
+ export CODEBAXING_DEVICE=auto # Auto-detect
485
391
  ```
486
392
 
487
- ### Testing
393
+ **Note:** macOS does not support CUDA. Use `webgpu` for GPU acceleration on Mac.
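A wrapper script that launches Codebaxing can check the two documented environment variables before starting it. The sketch below applies the defaults from the table above (`cpu` for the device, in-memory storage when `CHROMADB_URL` is unset). How Codebaxing itself validates these values is not stated here, so the lowercase normalization and the error on unknown values are assumptions, and the names `Settings` and `read_settings` are hypothetical.

```python
import os
from typing import Mapping, NamedTuple, Optional

# Device values listed in the configuration table.
ALLOWED_DEVICES = ("cpu", "webgpu", "cuda", "auto")


class Settings(NamedTuple):
    chromadb_url: Optional[str]  # None means the in-memory default
    device: str


def read_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the two documented variables, applying the documented defaults."""
    # Treat an unset or empty CHROMADB_URL as "use in-memory storage".
    url = env.get("CHROMADB_URL") or None
    # Assumption: values are compared case-insensitively.
    device = (env.get("CODEBAXING_DEVICE") or "cpu").strip().lower()
    if device not in ALLOWED_DEVICES:
        raise ValueError(
            f"CODEBAXING_DEVICE must be one of {ALLOWED_DEVICES}, got {device!r}"
        )
    return Settings(chromadb_url=url, device=device)
```

Failing early on a typo such as `CODEBAXING_DEVICE=gpu` gives a clearer error than discovering the problem partway through indexing.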
488
394
 
489
- ```bash
490
- # Run unit tests
491
- npm test
395
+ ## Supported Languages
492
396
 
493
- # Test indexing manually
494
- CHROMADB_URL=http://localhost:8000 npx tsx test-indexing.ts
495
- ```
397
+ Python, JavaScript, TypeScript, C, C++, Bash, Go, Java, Kotlin, Rust, Ruby, C#, PHP, Scala, Swift, Lua, Dart, Elixir, Haskell, OCaml, Zig, Perl, CSS, HTML, Vue, JSON, YAML, TOML, Makefile
398
+
399
+ ## Features
496
400
 
497
- ## Comparison: Grep vs Semantic Search
401
+ - **Semantic Code Search**: Find code by describing what you're looking for
402
+ - **24+ Languages**: Python, TypeScript, JavaScript, Go, Rust, Java, C/C++, and more
403
+ - **Memory Layer**: Store and recall project context across sessions
404
+ - **Incremental Indexing**: Only re-index changed files
405
+ - **100% Local**: No API calls, no cloud, works offline
406
+ - **GPU Acceleration**: Optional WebGPU/CUDA support
498
407
 
499
- | Aspect | Grep | Semantic Search |
500
- |--------|------|-----------------|
501
- | Query | Exact text match | Natural language |
502
- | "authentication" | Only finds "authentication" | Finds login, auth, credentials, etc. |
503
- | Understands context | No | Yes |
504
- | Finds synonyms | No | Yes |
505
- | Speed | Very fast | Fast (after indexing) |
506
- | Setup | None | Requires indexing |
408
+ ## Requirements
409
+
410
+ - Node.js >= 20.0.0
411
+ - ~500MB disk space for embedding model (downloaded on first run)
507
412
 
508
413
  ## Technical Details
509
414
 
@@ -515,6 +420,15 @@ CHROMADB_URL=http://localhost:8000 npx tsx test-indexing.ts
515
420
  | Code Parser | Tree-sitter |
516
421
  | MCP SDK | `@modelcontextprotocol/sdk` |
517
422
 
423
+ ## Development
424
+
425
+ ```bash
426
+ npm install # Install dependencies
427
+ npm run dev # Run with tsx (no build needed)
428
+ npm run build # Compile TypeScript
429
+ npm test # Run tests
430
+ ```
431
+
518
432
  ## License
519
433
 
520
434
  MIT