opencode-codebase-index 0.1.0 → 0.1.3

package/README.md CHANGED
@@ -1,92 +1,148 @@
  # opencode-codebase-index
 
- Semantic codebase indexing and search for OpenCode. Find code by meaning, not just keywords.
+ [![npm version](https://img.shields.io/npm/v/opencode-codebase-index.svg)](https://www.npmjs.com/package/opencode-codebase-index)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Downloads](https://img.shields.io/npm/dm/opencode-codebase-index.svg)](https://www.npmjs.com/package/opencode-codebase-index)
+ [![Build Status](https://img.shields.io/github/actions/workflow/status/Helweg/opencode-codebase-index/ci.yml?branch=main)](https://github.com/Helweg/opencode-codebase-index/actions)
+ [![Node.js](https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg)](https://nodejs.org/)
 
- ## When to Use
+ > **Stop grepping for concepts. Start searching for meaning.**
 
- | Scenario | Tool | Why |
- | ------------------------------- | ------------------- | ----------------------------- |
- | Don't know function/class names | `codebase_search` | Natural language → code |
- | Exploring unfamiliar codebase | `codebase_search` | Finds related code by meaning |
- | Know exact identifier | `grep` | Faster, finds all occurrences |
- | Need ALL matches | `grep` | Semantic returns top N only |
+ **opencode-codebase-index** brings semantic understanding to your [OpenCode](https://opencode.ai) workflow. Instead of guessing function names or grepping for keywords, ask your codebase questions in plain English.
 
- **Best workflow:** Semantic search for discovery → grep for precision.
+ ## 🚀 Why Use This?
 
- ## Installation
+ - 🧠 **Semantic Search**: Finds "user authentication" logic even if the function is named `check_creds`.
+ - ⚡ **Blazing Fast Indexing**: Powered by a Rust native module using `tree-sitter` and `usearch`. Incremental updates take milliseconds.
+ - 🔒 **Privacy Focused**: Your vector index is stored locally in your project.
+ - 🔌 **Model Agnostic**: Works out-of-the-box with GitHub Copilot, OpenAI, Gemini, or local Ollama models.
 
- ```bash
- npm install opencode-codebase-index
- ```
+ ## ⚡ Quick Start
 
- Add to your `opencode.json`:
+ 1. **Install the plugin**
+ ```bash
+ npm install opencode-codebase-index
+ ```
 
- ```json
- {
- "plugin": ["opencode-codebase-index"]
- }
- ```
+ 2. **Add to `opencode.json`**
+ ```json
+ {
+ "plugin": ["opencode-codebase-index"]
+ }
+ ```
 
- ## Tools
+ 3. **Start Searching**
+ Load OpenCode and ask:
+ > "Find the function that handles credit card validation errors"
 
- ### `codebase_search`
+ *The plugin will automatically index your codebase on the first run.*
+
+ ## 🔍 See It In Action
+
+ **Scenario**: You're new to a codebase and need to fix a bug in the payment flow.
 
- Search code by describing what it does. Returns focused results (5-10 files).
+ **Without Plugin (grep)**:
+ - `grep "payment" .` → 500 results (too many)
+ - `grep "card" .` → 200 results (mostly UI)
+ - `grep "stripe" .` → 50 results (maybe?)
 
+ **With `opencode-codebase-index`**:
+ You ask: *"Where is the payment validation logic?"*
+
+ Plugin returns:
+ ```text
+ src/services/billing.ts:45 (Class PaymentValidator)
+ src/utils/stripe.ts:12 (Function validateCardToken)
+ src/api/checkout.ts:89 (Route handler for /pay)
  ```
- "find the user authentication logic"
- "code that handles database connections"
- "error handling middleware for HTTP requests"
+
+ ## 🎯 When to Use What
+
+ | Scenario | Tool | Why |
+ |----------|------|-----|
+ | Don't know the function name | `codebase_search` | Semantic search finds by meaning |
+ | Exploring unfamiliar codebase | `codebase_search` | Discovers related code across files |
+ | Know exact identifier | `grep` | Faster, finds all occurrences |
+ | Need ALL matches | `grep` | Semantic returns top N only |
+ | Mixed discovery + precision | `/find` (hybrid) | Best of both worlds |
+
+ **Rule of thumb**: Semantic search for discovery → grep for precision.
+
+ ## 🛠️ How It Works
+
+ ```mermaid
+ graph TD
+ subgraph Indexing
+ A[Source Code] -->|Tree-sitter| B[Semantic Chunks]
+ B -->|Embedding Model| C[Vectors]
+ C -->|uSearch| D[(Local Vector Store)]
+ end
+
+ subgraph Searching
+ Q[User Query] -->|Embedding Model| V[Query Vector]
+ V -->|Cosine Similarity| D
+ D --> R[Ranked Results]
+ end
  ```
 
- **Good queries describe behavior:**
+ 1. **Parsing**: We use `tree-sitter` to intelligently parse your code into meaningful blocks (functions, classes, interfaces).
+ 2. **Embedding**: These blocks are converted into vector representations using your configured AI provider.
+ 3. **Storage**: Vectors are stored in a high-performance local index using `usearch`.
+ 4. **Search**: Your natural language queries are matched against this index to find the most semantically relevant code.
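The search step above (query vector compared to stored vectors by cosine similarity, then ranked) can be illustrated with a brute-force sketch. This is a minimal example under stated assumptions, not the plugin's implementation: the names `IndexedChunk`, `cosineSimilarity`, and `rankChunks` are hypothetical, and the README says the plugin delegates similarity search to `usearch` in its Rust module rather than scanning every vector. The `maxResults` and `minScore` parameters mirror the configuration options of the same names.

```typescript
// Hypothetical shape of an indexed chunk; not the plugin's real type.
interface IndexedChunk {
  path: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vector length mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query, drop weak matches, return the top N.
function rankChunks(
  query: number[],
  chunks: IndexedChunk[],
  maxResults: number,
  minScore: number
): { path: string; score: number }[] {
  return chunks
    .map((c) => ({ path: c.path, score: cosineSimilarity(query, c.vector) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, maxResults);
}
```

Because only the top `maxResults` entries survive, a ranking like this is suited to discovery rather than exhaustive matching, which is consistent with the README's advice to use `grep` when all occurrences are needed.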
 
- - "function that validates email addresses"
- - "middleware that checks JWT tokens"
- - "error handling for payment failures"
+ **Performance characteristics:**
+ - **Incremental indexing**: ~50ms check time — only re-embeds changed files
+ - **Smart chunking**: Understands code structure to keep functions whole
+ - **Native speed**: Core logic written in Rust for maximum performance
 
- **Use grep instead for:**
+ ## 🧰 Tools Available
 
- - Exact names: `validateEmail`, `UserService`
- - Keywords: `TODO`, `FIXME`
- - Literals: `401`, `error`
+ The plugin exposes these tools to the OpenCode agent:
 
- ### `index_codebase`
+ ### `codebase_search`
+ **The primary tool.** Searches code by describing behavior.
+ - **Use for**: Discovery, understanding flows, finding logic when you don't know the names.
+ - **Example**: `"find the middleware that sanitizes input"`
 
- Create or update the semantic index. Incremental indexing is fast (~50ms when nothing changed).
+ **Writing good queries:**
 
- | Parameter | Type | Default | Description |
- | ---------------- | ------- | ------- | ----------------------- |
- | `force` | boolean | false | Reindex from scratch |
- | `estimateOnly` | boolean | false | Show cost estimate only |
+ | Good queries (describe behavior) | ❌ Bad queries (too vague) |
+ |-------------------------------------|---------------------------|
+ | "function that validates email format" | "email" |
+ | "error handling for failed API calls" | "error" |
+ | "middleware that checks authentication" | "auth middleware" |
+ | "code that calculates shipping costs" | "shipping" |
+ | "where user permissions are checked" | "permissions" |
 
- ### `index_status`
+ ### `index_codebase`
+ Manually trigger indexing.
+ - **Use for**: Forcing a re-index or checking stats.
+ - **Parameters**: `force` (rebuild all), `estimateOnly` (check costs).
 
- Check if the codebase is indexed and ready for search.
+ ### `index_status`
+ Checks if the index is ready and healthy.
 
  ### `index_health_check`
+ Maintenance tool to remove stale entries from deleted files.
 
- Remove stale entries from deleted files.
-
- ## Slash Commands
+ ## 🎮 Slash Commands
 
- Copy the commands from `commands/` to your project's `.opencode/command/` directory:
+ For easier access, you can add slash commands to your project.
 
+ Copy the commands:
  ```bash
  cp -r node_modules/opencode-codebase-index/commands/* .opencode/command/
  ```
 
- Available commands:
-
- | Command | Description |
- | ------------------- | ----------------------------------- |
- | `/search <query>` | Semantic search for code by meaning |
- | `/index` | Create or update the semantic index |
- | `/find <query>` | Hybrid search (semantic + grep) |
+ | Command | Description |
+ | ------- | ----------- |
+ | `/search <query>` | **Pure Semantic Search**. Best for "How does X work?" |
+ | `/find <query>` | **Hybrid Search**. Combines semantic search + grep. Best for "Find usage of X". |
+ | `/index` | **Update Index**. Forces a refresh of the codebase index. |
 
- ## Configuration
+ ## ⚙️ Configuration
 
- Optional configuration in `.opencode/codebase-index.json`:
+ Zero-config by default (uses `auto` mode). Customize in `.opencode/codebase-index.json`:
 
  ```json
  {
@@ -106,91 +162,70 @@ Optional configuration in `.opencode/codebase-index.json`:
  }
  ```
 
- | Option | Default | Description |
- | ------------------------ | ------------- | ---------------------------------------------------------------- |
- | `embeddingProvider` | `"auto"` | `auto`, `github-copilot`, `openai`, `google`, `ollama` |
- | `scope` | `"project"` | `project` (local) or `global` (shared) |
- | `indexing.autoIndex` | `false` | Auto-index on plugin load |
- | `indexing.watchFiles` | `true` | Watch for file changes and re-index |
- | `indexing.maxFileSize` | `1048576` | Max file size in bytes (1MB) |
- | `search.maxResults` | `20` | Max results to return |
- | `search.minScore` | `0.1` | Minimum similarity score |
- | `search.hybridWeight` | `0.5` | Keyword vs semantic balance (0=semantic only, 1=keyword only) |
- | `search.contextLines` | `0` | Extra lines to include before/after each match |
+ ### Options Reference
+
+ | Option | Default | Description |
+ |--------|---------|-------------|
+ | `embeddingProvider` | `"auto"` | Which AI to use: `auto`, `github-copilot`, `openai`, `google`, `ollama` |
+ | `scope` | `"project"` | `project` = index per repo, `global` = shared index across repos |
+ | **indexing** | | |
+ | `autoIndex` | `false` | Automatically index on plugin load |
+ | `watchFiles` | `true` | Re-index when files change |
+ | `maxFileSize` | `1048576` | Skip files larger than this (bytes). Default: 1MB |
+ | **search** | | |
+ | `maxResults` | `20` | Maximum results to return |
+ | `minScore` | `0.1` | Minimum similarity score (0-1). Lower = more results |
+ | `hybridWeight` | `0.5` | Balance between keyword (1.0) and semantic (0.0) search |
+ | `contextLines` | `0` | Extra lines to include before/after each match |
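The `hybridWeight` option is described only by its endpoints: 0.0 means semantic only and 1.0 means keyword only. One plausible reading is a linear blend of the two scores, sketched below. The README does not state the formula, so the blend itself is an assumption, and `hybridScore` is a hypothetical helper rather than a function exported by the plugin.

```typescript
// Assumed linear blend between a semantic score and a keyword score.
// hybridWeight = 0 returns the semantic score; hybridWeight = 1 returns the keyword score.
function hybridScore(
  semantic: number,
  keyword: number,
  hybridWeight: number
): number {
  if (hybridWeight < 0 || hybridWeight > 1) {
    throw new RangeError("hybridWeight must be within [0, 1]");
  }
  return hybridWeight * keyword + (1 - hybridWeight) * semantic;
}
```

With the default of `0.5`, both signals would contribute equally under this assumption, so a chunk that matches the query words exactly but is semantically distant can still rank alongside a semantically close chunk with no literal match.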
 
  ### Embedding Providers
-
- Uses OpenCode's authentication. Auto-detected in order:
-
- 1. **GitHub Copilot** - Uses Copilot API
- 2. **OpenAI** - Uses OpenAI API
- 3. **Google** - Uses Gemini API
- 4. **Ollama** - Local, requires `nomic-embed-text` or similar
-
- ## How It Works
-
- 1. **Parsing** - Tree-sitter extracts semantic chunks (functions, classes, etc.)
- 2. **Embedding** - Chunks converted to vectors via embedding API
- 3. **Storage** - Vectors stored locally using usearch
- 4. **Search** - Query embedded and compared via cosine similarity + keyword matching
-
- Index stored in `.opencode/index/` within your project.
-
- ## Performance
-
- - **Incremental indexing**: ~50ms when no files changed
- - **Full index**: Depends on codebase size (Express.js: ~30s for 472 chunks)
- - **Search latency**: ~800-1000ms (embedding API call)
- - **Token savings**: 99%+ vs reading all files
-
- ## Requirements
-
- - Node.js >= 18
- - Rust toolchain (for building native module)
-
- ## Building
-
- ```bash
- npm run build # Full build (TS + Rust)
- npm run build:ts # TypeScript only
- npm run test # Run tests
- npm run typecheck # TypeScript type checking
- ```
-
- ## Local Development
-
- To test the plugin locally without publishing to npm:
-
- 1. Build the plugin:
- ```bash
- npm run build
- ```
-
- 2. Deploy to OpenCode's plugin cache:
- ```bash
- rm -rf ~/.cache/opencode/node_modules/opencode-codebase-index
- mkdir -p ~/.cache/opencode/node_modules/opencode-codebase-index
- cp -R dist native commands skill package.json ~/.cache/opencode/node_modules/opencode-codebase-index/
- ```
-
- 3. Create a loader in your test project:
- ```bash
- mkdir -p .opencode/plugin
- echo 'export { default } from "$HOME/.cache/opencode/node_modules/opencode-codebase-index/dist/index.js"' > .opencode/plugin/codebase-index.ts
- ```
-
- 4. Run `opencode` in your test project.
-
- ## Contributing
+ The plugin automatically detects available credentials in this order:
+ 1. **GitHub Copilot** (Free if you have it)
+ 2. **OpenAI** (Standard Embeddings)
+ 3. **Google** (Gemini Embeddings)
+ 4. **Ollama** (Local/Private - requires `nomic-embed-text`)
+
+ ## ⚠️ Tradeoffs
+
+ Be aware of these characteristics:
+
+ | Aspect | Reality |
+ |--------|---------|
+ | **Search latency** | ~800-1000ms per query (embedding API call) |
+ | **First index** | Takes time depending on codebase size (e.g., ~30s for 500 chunks) |
+ | **Requires API** | Needs an embedding provider (Copilot, OpenAI, Google, or local Ollama) |
+ | **Token costs** | Uses embedding tokens (free with Copilot, minimal with others) |
+ | **Best for** | Discovery and exploration, not exhaustive matching |
+
+ ## 💻 Local Development
+
+ 1. **Build**:
+ ```bash
+ npm run build
+ ```
+
+ 2. **Deploy to OpenCode Cache**:
+ ```bash
+ # Deploy script
+ rm -rf ~/.cache/opencode/node_modules/opencode-codebase-index
+ mkdir -p ~/.cache/opencode/node_modules/opencode-codebase-index
+ cp -R dist native commands skill package.json ~/.cache/opencode/node_modules/opencode-codebase-index/
+ ```
+
+ 3. **Register in Test Project**:
+ ```bash
+ mkdir -p .opencode/plugin
+ echo 'export { default } from "$HOME/.cache/opencode/node_modules/opencode-codebase-index/dist/index.js"' > .opencode/plugin/codebase-index.ts
+ ```
+
+ ## 🤝 Contributing
 
  1. Fork the repository
  2. Create a feature branch: `git checkout -b feature/my-feature`
- 3. Make your changes
- 4. Run the build: `npm run build`
- 5. Test locally using the steps above
- 6. Commit your changes: `git commit -m "feat: add my feature"`
- 7. Push to your fork: `git push origin feature/my-feature`
- 8. Open a pull request
+ 3. Make your changes and add tests
+ 4. Run checks: `npm run build && npm run test:run && npm run lint`
+ 5. Commit: `git commit -m "feat: add my feature"`
+ 6. Push and open a pull request
 
  CI will automatically run tests and type checking on your PR.
 
@@ -198,35 +233,30 @@ CI will automatically run tests and type checking on your PR.
 
  ```
  ├── src/
- │ ├── index.ts # Plugin entry point
- │ ├── config/ # Configuration schema
- │ ├── embeddings/ # Embedding provider detection and API
- │ ├── indexer/ # Core indexing logic
- │ ├── tools/ # OpenCode tool definitions
- │ ├── utils/ # File collection, cost estimation
- │ ├── native/ # Rust native module wrapper
- │ └── watcher/ # File change watcher
+ │ ├── index.ts # Plugin entry point
+ │ ├── config/ # Configuration schema
+ │ ├── embeddings/ # Provider detection and API calls
+ │ ├── indexer/ # Core indexing logic + inverted index
+ │ ├── tools/ # OpenCode tool definitions
+ │ ├── utils/ # File collection, cost estimation
+ │ ├── native/ # Rust native module wrapper
+ │ └── watcher/ # File change watcher
  ├── native/
- │ └── src/ # Rust native module (tree-sitter, usearch)
- ├── tests/ # Unit tests (vitest)
- ├── commands/ # Slash command definitions
- ├── skill/ # Agent skill guidance
- └── .github/workflows/ # CI/CD (test, build, publish)
+ │ └── src/ # Rust: tree-sitter, usearch, xxhash
+ ├── tests/ # Unit tests (vitest)
+ ├── commands/ # Slash command definitions
+ ├── skill/ # Agent skill guidance
+ └── .github/workflows/ # CI/CD (test, build, publish)
  ```
 
  ### Native Module
 
- The Rust native module handles:
- - Tree-sitter parsing for semantic chunking
- - xxHash for fast file hashing
- - usearch for vector storage and similarity search
-
- To rebuild the native module:
- ```bash
- npm run build:native
- ```
+ The Rust native module handles performance-critical operations:
+ - **tree-sitter**: Language-aware code parsing
+ - **usearch**: High-performance vector similarity search
+ - **xxhash**: Fast content hashing for change detection
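The change-detection role of content hashing can be sketched as follows: store one hash per file, then re-embed only the files whose current hash differs or that are new. This is a minimal illustration under stated assumptions. It swaps in a small FNV-1a variant (hashing UTF-16 code units) as a stand-in for xxhash so the example needs no dependencies, and `fnv1a` and `changedFiles` are hypothetical names, not functions of the plugin or its Rust module.

```typescript
// Stand-in hash (FNV-1a over UTF-16 code units); the plugin uses xxhash in Rust.
function fnv1a(text: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Return the paths that are new or whose content hash differs from the stored one.
function changedFiles(
  storedHashes: Map<string, number>,
  currentContents: Map<string, string>
): string[] {
  const changed: string[] = [];
  currentContents.forEach((content, path) => {
    if (storedHashes.get(path) !== fnv1a(content)) {
      changed.push(path);
    }
  });
  return changed;
}
```

Comparing hashes is cheap next to calling an embedding API, which fits the README's claim that an incremental run re-embeds only changed files.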
 
- Requires Rust toolchain installed.
+ Rebuild with: `npm run build:native` (requires Rust toolchain)
 
  ## License