@autodev/codebase 0.0.4 → 0.0.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,231 +1,321 @@
-
-
  # @autodev/codebase
 
  <div align="center">
- <img src="src/images/image2.png" alt="Image 2" style="display: inline-block; width: 350px; margin: 0 10px;" />
- <img src="src/images/image3.png" alt="Image 3" style="display: inline-block; width: 200px; margin: 0 10px;" />
+
+ [![npm version](https://img.shields.io/npm/v/@autodev/codebase)](https://www.npmjs.com/package/@autodev/codebase)
+ [![GitHub stars](https://img.shields.io/github/stars/anrgct/autodev-codebase)](https://github.com/anrgct/autodev-codebase)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+
  </div>
 
- <br />
+ ```sh
+ ╭─ ~/workspace/autodev-codebase
+ ╰─❯ codebase --demo --search="user manage"
+ Found 3 results in 2 files for: "user manage"
+
+ ==================================================
+ File: "hello.js"
+ ==================================================
+ < class UserManager > (L7-20)
+ class UserManager {
+ constructor() {
+ this.users = [];
+ }
+
+ addUser(user) {
+ this.users.push(user);
+ console.log('User added:', user.name);
+ }
+
+ getUsers() {
+ return this.users;
+ }
+ }
 
- A platform-agnostic code analysis library with semantic search capabilities and MCP (Model Context Protocol) server support. This library provides intelligent code indexing, vector-based semantic search, and can be integrated into various development tools and IDEs.
+ ==================================================
+ File: "README.md" | 2 snippets
+ ==================================================
+ < md_h1 Demo Project > md_h2 Usage > md_h3 JavaScript Functions > (L16-20)
+ ### JavaScript Functions
+
+ - greetUser(name) - Greets a user by name
+ - UserManager - Class for managing user data
+
+ ─────
+ < md_h1 Demo Project > md_h2 Search Examples > (L27-38)
+ ## Search Examples
+
+ Try searching for:
+ - "greet user"
+ - "process data"
+ - "user management"
+ - "batch processing"
+ - "YOLO model"
+ - "computer vision"
+ - "object detection"
+ - "model training"
+
+ ```
+ A vector embedding-based code semantic search tool with MCP server and multi-model integration. Can be used as a pure CLI tool. Supports Ollama for fully local embedding and reranking, enabling complete offline operation and privacy protection for your code repository.
 
  ## 🚀 Features
 
- - **Semantic Code Search**: Vector-based code search using embeddings
- - **MCP Server Support**: HTTP-based MCP server for IDE integration
- - **Terminal UI**: Interactive CLI with rich terminal interface
- - **Tree-sitter Parsing**: Advanced code parsing and analysis
- - **Vector Storage**: Qdrant vector database integration
- - **Flexible Embedding**: Support for various embedding models via Ollama
+ - **🔍 Semantic Code Search**: Vector-based search using advanced embedding models
+ - **🌐 MCP Server**: HTTP-based MCP server with SSE and stdio adapters
+ - **💻 Pure CLI Tool**: Standalone command-line interface without GUI dependencies
+ - **⚙️ Layered Configuration**: CLI, project, and global config management
+ - **🎯 Advanced Path Filtering**: Glob patterns with brace expansion and exclusions
+ - **🌲 Tree-sitter Parsing**: Support for 40+ programming languages
+ - **💾 Qdrant Integration**: High-performance vector database
+ - **🔄 Multiple Providers**: OpenAI, Ollama, Jina, Gemini, Mistral, OpenRouter, Vercel
+ - **📊 Real-time Watching**: Automatic index updates
+ - **⚡ Batch Processing**: Efficient parallel processing
 
  ## 📦 Installation
 
- ### 1. Install and Start Ollama
-
+ ### 1. Dependencies
  ```bash
- # Install Ollama (macOS)
- brew install ollama
-
- # Start Ollama service
+ brew install ollama ripgrep
  ollama serve
-
- # In a new terminal, pull the embedding model
  ollama pull nomic-embed-text
  ```
 
- ### 2. Install and Start Qdrant
-
- Start Qdrant using Docker:
+ ### 2. Qdrant
+ ```bash
+ docker run -d -p 6333:6333 -p 6334:6334 --name qdrant qdrant/qdrant
+ ```
 
+ ### 3. Install
  ```bash
- # Start Qdrant container
- docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
+ npm install -g @autodev/codebase
+ codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
  ```
 
- Or download and run Qdrant directly:
+ ## 🛠️ Quick Start
 
  ```bash
- # Download and run Qdrant
- wget https://github.com/qdrant/qdrant/releases/latest/download/qdrant-x86_64-unknown-linux-gnu.tar.gz
- tar -xzf qdrant-x86_64-unknown-linux-gnu.tar.gz
- ./qdrant
+ # Demo mode (recommended for first-time)
+ # Creates a demo directory in current working directory for testing
+
+ # Index & search
+ codebase --demo --index
+ codebase --demo --search="user greet"
+
+ # MCP server
+ codebase --demo --serve
  ```
 
- ### 3. Verify Services Are Running
+ ## 📋 Commands
 
+ ### Indexing & Search
  ```bash
- # Check Ollama
- curl http://localhost:11434/api/tags
+ # Index the codebase
+ codebase --index --path=/my/project --force
+
+ # Search with filters
+ codebase --search="error handling" --path-filters="src/**/*.ts"
+
+ # Search with custom limit and minimum score
+ codebase --search="authentication" --limit=20 --min-score=0.7
+ codebase --search="API" -l 30 -s 0.5
+
+ # Search in JSON format
+ codebase --search="authentication" --json
 
- # Check Qdrant
- curl http://localhost:6333/collections
+ # Clear index data
+ codebase --clear --path=/my/project
  ```
- ### 4. Install Autodev-codebase
 
+ ### MCP Server
  ```bash
- npm install -g @autodev/codebase
- ```
+ # HTTP mode (recommended)
+ codebase --serve --port=3001 --path=/my/project
 
- Alternatively, you can install it locally:
+ # Stdio adapter
+ codebase --stdio-adapter --server-url=http://localhost:3001/mcp
  ```
- git clone https://github.com/anrgct/autodev-codebase
- cd autodev-codebase
- npm install
- npm run build
- npm link
+
+ ### Configuration
+ ```bash
+ # View config
+ codebase --get-config
+ codebase --get-config embedderProvider --json
+
+ # Set config
+ codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
+ codebase --set-config --global qdrantUrl=http://localhost:6333
  ```
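+
+ The project-level `--set-config` call above persists its keys to `./autodev-config.json`, while the `--global` variant writes to `~/.autodev-cache/autodev-config.json`. As a rough sketch, the resulting project config file would look like this (illustrative contents only; an existing file may carry additional keys):
+ ```json
+ {
+ "embedderProvider": "ollama",
+ "embedderModelId": "nomic-embed-text"
+ }
+ ```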
- ## 🛠️ Usage
 
- ### Command Line Interface
+ ### Advanced Features
 
- The CLI provides two main modes:
+ #### 🔍 LLM-Powered Search Reranking
+ Enable LLM reranking to dramatically improve search relevance:
 
- #### 1. Interactive TUI Mode (Default)
  ```bash
- # Basic usage: index your current folder as the codebase.
- # Be cautious when running this command if you have a large number of files.
- codebase
+ # Enable reranking with Ollama (recommended)
+ codebase --set-config rerankerEnabled=true,rerankerProvider=ollama,rerankerOllamaModelId=qwen3-vl:4b-instruct
 
+ # Or use OpenAI-compatible providers
+ codebase --set-config rerankerEnabled=true,rerankerProvider=openai-compatible,rerankerOpenAiCompatibleModelId=deepseek-chat
 
- # With custom options
- codebase --demo # Create a local demo directory and test the indexing service, recommend for setup
- codebase --path=/my/project
- codebase --path=/my/project --log-level=info
+ # Search with automatic reranking
+ codebase --search="user authentication" # Results are automatically reranked by LLM
  ```
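+
+ For reference, the Ollama reranker setup above persists keys like the following (a sketch only: field names follow the options referenced in this README, while the model choice and the `rerankerMinScore` threshold on the reranker's 0-10 relevance scale are illustrative):
+ ```json
+ {
+ "rerankerEnabled": true,
+ "rerankerProvider": "ollama",
+ "rerankerOllamaModelId": "qwen3-vl:4b-instruct",
+ "rerankerMinScore": 6
+ }
+ ```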
 
- #### 2. MCP Server Mode (Recommended for IDE Integration)
+ **Benefits:**
+ - 🎯 **Higher precision**: LLM understands semantic relevance beyond vector similarity
+ - 📊 **Smart scoring**: Results are reranked on a 0-10 scale based on query relevance
+ - ⚡ **Batch processing**: Efficiently handles large result sets with configurable batch sizes
+ - 🎛️ **Threshold control**: Filter results with `rerankerMinScore` to keep only high-quality matches
+
+ #### Path Filtering & Export
  ```bash
- # Start long-running MCP server
- cd /my/project
- codebase mcp-server
+ # Path filtering with brace expansion and exclusions
+ codebase --search="API" --path-filters="src/**/*.ts,lib/**/*.js"
+ codebase --search="utils" --path-filters="{src,test}/**/*.ts"
 
- # With custom configuration
- codebase mcp-server --port=3001 --host=localhost
- codebase mcp-server --path=/workspace --port=3002
+ # Export results in JSON format for scripts
+ codebase --search="auth" --json
  ```
 
- ### IDE Integration (Cursor/Claude)
+ ## ⚙️ Configuration
+
+ ### Config Layers (Priority Order)
+ 1. **CLI Arguments** - Runtime parameters (`--path`, `--config`, `--log-level`, `--force`, etc.)
+ 2. **Project Config** - `./autodev-config.json` (or custom path via `--config`)
+ 3. **Global Config** - `~/.autodev-cache/autodev-config.json`
+ 4. **Built-in Defaults** - Fallback values
+
+ **Note:** CLI arguments provide runtime overrides for paths, logging, and operational behavior. For persistent configuration (embedderProvider, API keys, search parameters), use `--set-config` to save to config files.
 
- Configure your IDE to connect to the MCP server:
+ ### Common Config Examples
 
+ **Ollama:**
  ```json
  {
- "mcpServers": {
- "codebase": {
- "url": "http://localhost:3001/sse"
- }
- }
+ "embedderProvider": "ollama",
+ "embedderModelId": "nomic-embed-text",
+ "qdrantUrl": "http://localhost:6333"
  }
  ```
 
- For clients that do not support SSE MCP, you can use the following configuration:
-
+ **OpenAI:**
  ```json
  {
- "mcpServers": {
- "codebase": {
- "command": "codebase",
- "args": [
- "stdio-adapter",
- "--server-url=http://localhost:3001/sse"
- ]
- }
- }
+ "embedderProvider": "openai",
+ "embedderModelId": "text-embedding-3-small",
+ "embedderOpenAiApiKey": "sk-your-key",
+ "qdrantUrl": "http://localhost:6333"
  }
  ```
 
- ### Library Usage
+ **OpenAI-Compatible:**
+ ```json
+ {
+ "embedderProvider": "openai-compatible",
+ "embedderModelId": "text-embedding-3-small",
+ "embedderOpenAiCompatibleApiKey": "sk-your-key",
+ "embedderOpenAiCompatibleBaseUrl": "https://api.openai.com/v1"
+ }
+ ```
 
- #### Node.js Usage
- ```typescript
- import { createNodeDependencies } from '@autodev/codebase/adapters/nodejs'
- import { CodeIndexManager } from '@autodev/codebase'
+ ### Key Configuration Options
 
- const deps = createNodeDependencies({
- workspacePath: '/path/to/project',
- storageOptions: { /* ... */ },
- loggerOptions: { /* ... */ },
- configOptions: { /* ... */ }
- })
+ | Category | Options | Description |
+ |----------|---------|-------------|
+ | **Embedding** | `embedderProvider`, `embedderModelId`, `embedderModelDimension` | Provider and model settings |
+ | **API Keys** | `embedderOpenAiApiKey`, `embedderOpenAiCompatibleApiKey` | Authentication |
+ | **Vector Store** | `qdrantUrl`, `qdrantApiKey` | Qdrant connection |
+ | **Search** | `vectorSearchMinScore`, `vectorSearchMaxResults` | Search behavior |
+ | **Reranker** | `rerankerEnabled`, `rerankerProvider` | Result reranking |
 
- const manager = CodeIndexManager.getInstance(deps)
- await manager.initialize()
- await manager.startIndexing()
- ```
+ **Key CLI Arguments:**
+ - `--serve` / `--index` / `--search` - Core operations
+ - `--get-config` / `--set-config` - Configuration management
+ - `--path`, `--demo`, `--force` - Common options
+ - `--limit` / `-l <number>` - Maximum number of search results (default: from config, max 50)
+ - `--min-score` / `-s <number>` - Minimum similarity score for search results (0-1, default: from config)
+ - `--help` - Show all available options
 
- ## 🔧 CLI Options
+ For complete CLI reference, see [CONFIG.md](CONFIG.md).
 
- ### Global Options
- - `--path=<path>` - Workspace path (default: current directory)
- - `--demo` - Create demo files in workspace
- - `--ollama-url=<url>` - Ollama API URL (default: http://localhost:11434)
- - `--qdrant-url=<url>` - Qdrant vector DB URL (default: http://localhost:6333)
- - `--model=<model>` - Embedding model (default: nomic-embed-text)
- - `--config=<path>` - Config file path
- - `--storage=<path>` - Storage directory path
- - `--cache=<path>` - Cache directory path
- - `--log-level=<level>` - Log level: error|warn|info|debug (default: error)
- - `--help, -h` - Show help
+ **Configuration Commands:**
+ ```bash
+ # View config
+ codebase --get-config
+ codebase --get-config --json
 
- ### MCP Server Options
- - `--port=<port>` - HTTP server port (default: 3001)
- - `--host=<host>` - HTTP server host (default: localhost)
+ # Set config (saves to file)
+ codebase --set-config embedderProvider=ollama,embedderModelId=nomic-embed-text
+ codebase --set-config --global embedderProvider=openai,embedderOpenAiApiKey=sk-xxx
 
- ## 🌐 MCP Server Features
+ # Use custom config file
+ codebase --config=/path/to/config.json --get-config
+ codebase --config=/path/to/config.json --set-config embedderProvider=ollama
 
- ### Web Interface
- - **Home Page**: `http://localhost:3001` - Server status and configuration
- - **Health Check**: `http://localhost:3001/health` - JSON status endpoint
- - **MCP Endpoint**: `http://localhost:3001/sse` - SSE/HTTP MCP protocol endpoint
+ # Runtime override (paths, logging, etc.)
+ codebase --index --path=/my/project --log-level=info --force
+ ```
 
- ### Available MCP Tools
- - **`search_codebase`** - Semantic search through your codebase
- - Parameters: `query` (string), `limit` (number), `filters` (object)
- - Returns: Formatted search results with file paths, scores, and code blocks
- - **`get_search_stats`** - Get indexing status and statistics
- - **`configure_search`** - Configure search parameters at runtime
+ For complete configuration reference, see [CONFIG.md](CONFIG.md).
 
+ ## 🔌 MCP Integration
 
- ### Scripts
+ ### HTTP Streamable Mode (Recommended)
  ```bash
- # Development mode with demo files
- npm run dev
+ codebase --serve --port=3001
+ ```
 
- # Build for production
- npm run build
+ **IDE Config:**
+ ```json
+ {
+ "mcpServers": {
+ "codebase": {
+ "url": "http://localhost:3001/mcp"
+ }
+ }
+ }
+ ```
 
- # Type checking
- npm run type-check
+ ### Stdio Adapter
+ ```bash
+ # First start the MCP server in one terminal
+ codebase --serve --port=3001
 
- # Run TUI demo
- npm run demo-tui
+ # Then connect via stdio adapter in another terminal (for IDEs that require stdio)
+ codebase --stdio-adapter --server-url=http://localhost:3001/mcp
+ ```
 
- # Start MCP server demo
- npm run mcp-server
+ **IDE Config:**
+ ```json
+ {
+ "mcpServers": {
+ "codebase": {
+ "command": "codebase",
+ "args": ["stdio-adapter", "--server-url=http://localhost:3001/mcp"]
+ }
+ }
+ }
  ```
 
- ## 💡 Why Use MCP Server Mode?
+ ## 🤝 Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request or open an Issue on [GitHub](https://github.com/anrgct/autodev-codebase).
 
- ### Problems Solved
- - **❌ Repeated Indexing**: Every IDE connection re-indexes, wasting time and resources
- - **❌ Complex Configuration**: Each project needs different path parameters in IDE
- - **❌ Resource Waste**: Multiple IDE windows start multiple server instances
+ ## 📄 License
 
- ### Benefits
- - **✅ One-time Indexing**: Server runs long-term, index persists
- - **✅ Simplified Configuration**: Universal IDE configuration, no project-specific paths
- - **✅ Resource Efficiency**: One server instance per project
- - **✅ Better Developer Experience**: Start server in project directory intuitively
- - **✅ Backward Compatible**: Still supports traditional per-connection mode
- - **✅ Web Interface**: Status monitoring and configuration help
- - **✅ Dual Mode**: Can run both TUI and MCP server simultaneously
+ This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).
 
+ ## 🙏 Acknowledgments
 
- This is a platform-agnostic library extracted from the roo-code VSCode plugin.
- ## 📚 Examples
+ This project is a fork and derivative work based on [Roo Code](https://github.com/RooCodeInc/Roo-Code). We've built upon their excellent foundation to create this specialized codebase analysis tool with enhanced features and MCP server capabilities.
 
- See the `examples/` directory for complete usage examples:
- - `nodejs-usage.ts` - Node.js integration examples
- - `run-demo-tui.tsx` - TUI demo application
+ ---
+
+ <div align="center">
+
+ **🌟 If you find this tool helpful, please give us a [star on GitHub](https://github.com/anrgct/autodev-codebase)!**
+
+ Made with ❤️ for the developer community
+
+ </div>