opencode-codebase-index 0.1.8 → 0.1.11

package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Kenneth Helweg
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md CHANGED
@@ -68,6 +68,28 @@ src/api/checkout.ts:89 (Route handler for /pay)
 
  **Rule of thumb**: Semantic search for discovery → grep for precision.
 
+ ## 📊 Token Usage
+
+ In our testing across open-source codebases (axios, express), we observed **up to 90% reduction in token usage** for conceptual queries like *"find the error handling middleware"*.
+
+ ### Why It Saves Tokens
+
+ - **Without plugin**: Agent explores files, reads code, backtracks, explores more
+ - **With plugin**: Semantic search returns relevant code immediately → less exploration
+
+ ### Key Takeaways
+
+ 1. **Significant savings possible**: Up to 90% reduction in the best cases
+ 2. **Results vary**: Savings depend on query type, codebase structure, and agent behavior
+ 3. **Best for discovery**: Conceptual queries benefit most; exact identifier lookups should use grep
+ 4. **Complements existing tools**: Provides a faster initial signal, doesn't replace grep/explore
+
+ ### When the Plugin Helps Most
+
+ - **Conceptual queries**: "Where is the authentication logic?" (no keywords to grep for)
+ - **Unfamiliar codebases**: You don't know what to search for yet
+ - **Large codebases**: Semantic search scales better than exhaustive exploration
+
  ## 🛠️ How It Works
 
  ```mermaid
@@ -75,25 +97,31 @@ graph TD
  subgraph Indexing
  A[Source Code] -->|Tree-sitter| B[Semantic Chunks]
  B -->|Embedding Model| C[Vectors]
- C -->|uSearch| D[(Local Vector Store)]
+ C -->|uSearch| D[(Vector Store)]
+ B -->|BM25| E[(Inverted Index)]
  end
 
  subgraph Searching
  Q[User Query] -->|Embedding Model| V[Query Vector]
  V -->|Cosine Similarity| D
- D --> R[Ranked Results]
+ Q -->|BM25| E
+ D --> F[Hybrid Fusion]
+ E --> F
+ F --> R[Ranked Results]
  end
  ```
 
- 1. **Parsing**: We use `tree-sitter` to intelligently parse your code into meaningful blocks (functions, classes, interfaces).
- 2. **Embedding**: These blocks are converted into vector representations using your configured AI provider.
- 3. **Storage**: Vectors are stored in a high-performance local index using `usearch`.
- 4. **Search**: Your natural language queries are matched against this index to find the most semantically relevant code.
+ 1. **Parsing**: We use `tree-sitter` to intelligently parse your code into meaningful blocks (functions, classes, interfaces). JSDoc comments and docstrings are automatically included with their associated code.
+ 2. **Chunking**: Large blocks are split with overlapping windows to preserve context across chunk boundaries.
+ 3. **Embedding**: These blocks are converted into vector representations using your configured AI provider.
+ 4. **Storage**: Vectors are stored in a high-performance local index using `usearch` with F16 quantization for 50% memory savings.
+ 5. **Hybrid Search**: Combines semantic similarity (vectors) with BM25 keyword matching for best results.
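The "Hybrid Fusion" step in the diagram and step 5 above do not say how the vector and BM25 result lists are merged. As a minimal sketch, assuming reciprocal rank fusion (one common technique, not confirmed as the plugin's method), the merge could look like this. The function name `reciprocalRankFusion`, the constant `k = 60`, and the sample ids are all hypothetical.

```typescript
// Hypothetical sketch: the README does not specify the fusion method.
// Reciprocal rank fusion (RRF) is used here as one common choice.

/** Merge ranked lists of chunk ids; earlier positions contribute more. */
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // Rank is 1-based, so the top hit of a list adds 1 / (k + 1).
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Sort ids by descending fused score.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Made-up example: ids ranked by vector similarity and by BM25.
const semanticHits = ["auth.ts:12", "session.ts:40", "login.ts:7"];
const keywordHits = ["login.ts:7", "auth.ts:12", "routes.ts:3"];
const fused = reciprocalRankFusion([semanticHits, keywordHits]);
```

Ids that appear in both lists accumulate two contributions, so they tend to rise above ids found by only one retriever, which is the behaviour a hybrid ranking is meant to provide.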
 
  **Performance characteristics:**
  - **Incremental indexing**: ~50ms check time — only re-embeds changed files
- - **Smart chunking**: Understands code structure to keep functions whole
+ - **Smart chunking**: Understands code structure to keep functions whole, with overlap for context
  - **Native speed**: Core logic written in Rust for maximum performance
+ - **Memory efficient**: F16 vector quantization reduces index size by 50%
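To illustrate what "overlap for context" can mean in practice, here is a small sketch of splitting a large block into overlapping line windows. It is only an assumption about the general technique: the plugin's real chunker lives in its Rust module, and the function name `chunkWithOverlap` and the window and overlap sizes are hypothetical.

```typescript
// Hypothetical sketch; the plugin's window and overlap sizes are not documented in the README.

/** Split lines into windows of `size` lines, each sharing `overlap` lines with the previous one. */
function chunkWithOverlap(lines: string[], size: number, overlap: number): string[][] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("need size > 0 and 0 <= overlap < size");
  }
  const step = size - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < lines.length; start += step) {
    chunks.push(lines.slice(start, start + size));
    // Stop once a window reaches the end, to avoid a trailing chunk of pure overlap.
    if (start + size >= lines.length) break;
  }
  return chunks;
}

// Seven lines, windows of four lines, two lines shared between neighbours.
const body = ["l1", "l2", "l3", "l4", "l5", "l6", "l7"];
const windows = chunkWithOverlap(body, 4, 2);
```

With these inputs the windows are `l1..l4`, `l3..l6`, and `l5..l7`, so every boundary is covered by two chunks.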
 
  ## 🧰 Tools Available
 
@@ -117,7 +145,7 @@ The plugin exposes these tools to the OpenCode agent:
  ### `index_codebase`
  Manually trigger indexing.
  - **Use for**: Forcing a re-index or checking stats.
- - **Parameters**: `force` (rebuild all), `estimateOnly` (check costs).
+ - **Parameters**: `force` (rebuild all), `estimateOnly` (check costs), `verbose` (show skipped files and parse failures).
 
  ### `index_status`
  Checks if the index is ready and healthy.
@@ -127,12 +155,7 @@ Maintenance tool to remove stale entries from deleted files.
 
  ## 🎮 Slash Commands
 
- For easier access, you can add slash commands to your project.
-
- Copy the commands:
- ```bash
- cp -r node_modules/opencode-codebase-index/commands/* .opencode/command/
- ```
+ The plugin automatically registers these slash commands:
 
  | Command | Description |
  | ------- | ----------- |
@@ -151,7 +174,9 @@ Zero-config by default (uses `auto` mode). Customize in `.opencode/codebase-inde
  "indexing": {
  "autoIndex": false,
  "watchFiles": true,
- "maxFileSize": 1048576
+ "maxFileSize": 1048576,
+ "maxChunksPerFile": 100,
+ "semanticOnly": false
  },
  "search": {
  "maxResults": 20,
@@ -172,6 +197,10 @@ Zero-config by default (uses `auto` mode). Customize in `.opencode/codebase-inde
  | `autoIndex` | `false` | Automatically index on plugin load |
  | `watchFiles` | `true` | Re-index when files change |
  | `maxFileSize` | `1048576` | Skip files larger than this (bytes). Default: 1MB |
+ | `maxChunksPerFile` | `100` | Maximum chunks to index per file (controls token costs for large files) |
+ | `semanticOnly` | `false` | When `true`, only index semantic nodes (functions, classes) and skip generic blocks |
+ | `retries` | `3` | Number of retry attempts for failed embedding API calls |
+ | `retryDelayMs` | `1000` | Delay between retries in milliseconds |
  | **search** | | |
  | `maxResults` | `20` | Maximum results to return |
  | `minScore` | `0.1` | Minimum similarity score (0-1). Lower = more results |
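The two new indexing options, `maxChunksPerFile` and `semanticOnly`, can be read as a filter followed by a cap. The sketch below shows that reading under stated assumptions: the `Chunk` shape, the `kind` labels, and the `selectChunks` helper are hypothetical and are not taken from the plugin's source.

```typescript
// Hypothetical sketch of how the two indexing options could be applied;
// the Chunk shape and the "kind" labels are assumptions, not the plugin's types.

interface Chunk {
  kind: "function" | "class" | "block";
  text: string;
}

interface IndexingOptions {
  maxChunksPerFile: number;
  semanticOnly: boolean;
}

function selectChunks(chunks: Chunk[], options: IndexingOptions): Chunk[] {
  // With semanticOnly, drop generic blocks and keep functions and classes.
  const candidates = options.semanticOnly
    ? chunks.filter((chunk) => chunk.kind !== "block")
    : chunks;
  // Cap the count so one large file cannot dominate embedding cost.
  return candidates.slice(0, options.maxChunksPerFile);
}

// Made-up chunks for a single file.
const fileChunks: Chunk[] = [
  { kind: "function", text: "function pay() {}" },
  { kind: "block", text: "const x = 1;" },
  { kind: "class", text: "class Cart {}" },
];
const selected = selectChunks(fileChunks, { maxChunksPerFile: 100, semanticOnly: true });
```

Whether the real implementation filters before capping, or caps first, is not stated in the README; the order here is a design choice of the sketch.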
@@ -204,19 +233,16 @@ Be aware of these characteristics:
  npm run build
  ```
 
- 2. **Deploy to OpenCode Cache**:
- ```bash
- # Deploy script
- rm -rf ~/.cache/opencode/node_modules/opencode-codebase-index
- mkdir -p ~/.cache/opencode/node_modules/opencode-codebase-index
- cp -R dist native commands skill package.json ~/.cache/opencode/node_modules/opencode-codebase-index/
+ 2. **Register in Test Project** (use `file://` URL in `opencode.json`):
+ ```json
+ {
+ "plugin": [
+ "file:///path/to/opencode-codebase-index"
+ ]
+ }
  ```
-
- 3. **Register in Test Project**:
- ```bash
- mkdir -p .opencode/plugin
- echo 'export { default } from "$HOME/.cache/opencode/node_modules/opencode-codebase-index/dist/index.js"' > .opencode/plugin/codebase-index.ts
- ```
+
+ This loads directly from your source directory, so changes take effect after rebuilding.
 
  ## 🤝 Contributing
 
@@ -252,8 +278,9 @@ CI will automatically run tests and type checking on your PR.
  ### Native Module
 
  The Rust native module handles performance-critical operations:
- - **tree-sitter**: Language-aware code parsing
- - **usearch**: High-performance vector similarity search
+ - **tree-sitter**: Language-aware code parsing with JSDoc/docstring extraction
+ - **usearch**: High-performance vector similarity search with F16 quantization
+ - **BM25 inverted index**: Fast keyword search for hybrid retrieval
  - **xxhash**: Fast content hashing for change detection
 
  Rebuild with: `npm run build:native` (requires Rust toolchain)
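The README lists content hashing as the basis for change detection, which is what lets incremental indexing re-embed only changed files. The sketch below illustrates that idea with a dependency-free 32-bit FNV-1a hash standing in for xxhash. The swap is deliberate and is an assumption of this example; the helper names `fnv1a` and `changedFiles` are hypothetical.

```typescript
// Hypothetical sketch: FNV-1a stands in for xxhash so the example needs no dependencies.

/** 32-bit FNV-1a hash over the UTF-16 code units of a string. */
function fnv1a(content: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    // Multiply by the FNV prime, keeping the result an unsigned 32-bit integer.
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

/** Return the paths whose content hash differs from the stored one, or that are new. */
function changedFiles(files: Map<string, string>, storedHashes: Map<string, number>): string[] {
  const changed: string[] = [];
  files.forEach((content, path) => {
    if (storedHashes.get(path) !== fnv1a(content)) changed.push(path);
  });
  return changed;
}

// Hashes recorded at the previous indexing run (made-up files).
const previous = new Map<string, number>([
  ["a.ts", fnv1a("export const a = 1;")],
  ["b.ts", fnv1a("export const b = 2;")],
]);
// Current file contents: a.ts is unchanged, b.ts was edited, c.ts is new.
const current = new Map<string, string>([
  ["a.ts", "export const a = 1;"],
  ["b.ts", "export const b = 3;"],
  ["c.ts", "export const c = 4;"],
]);
const toReembed = changedFiles(current, previous);
```

Only `b.ts` and `c.ts` would be sent for embedding here; the unchanged `a.ts` is skipped because its stored hash still matches.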