opencode-codebase-index 0.1.11 → 0.2.0

package/README.md CHANGED
@@ -14,6 +14,7 @@
 
  - 🧠 **Semantic Search**: Finds "user authentication" logic even if the function is named `check_creds`.
  - ⚡ **Blazing Fast Indexing**: Powered by a Rust native module using `tree-sitter` and `usearch`. Incremental updates take milliseconds.
+ - 🌿 **Branch-Aware**: Seamlessly handles git branch switches — reuses embeddings, filters stale results.
  - 🔒 **Privacy Focused**: Your vector index is stored locally in your project.
  - 🔌 **Model Agnostic**: Works out-of-the-box with GitHub Copilot, OpenAI, Gemini, or local Ollama models.
 
@@ -31,11 +32,12 @@
  }
  ```
 
- 3. **Start Searching**
- Load OpenCode and ask:
- > "Find the function that handles credit card validation errors"
+ 3. **Index your codebase**
+ Run `/index` or ask the agent to index your codebase. This only needs to be done once — subsequent updates are incremental.
 
- *The plugin will automatically index your codebase on the first run.*
+ 4. **Start Searching**
+ Ask:
+ > "Find the function that handles credit card validation errors"
 
  ## 🔍 See It In Action
 
@@ -98,13 +100,16 @@ graph TD
  A[Source Code] -->|Tree-sitter| B[Semantic Chunks]
  B -->|Embedding Model| C[Vectors]
  C -->|uSearch| D[(Vector Store)]
+ C -->|SQLite| G[(Embeddings DB)]
  B -->|BM25| E[(Inverted Index)]
+ B -->|Branch Catalog| G
  end
 
  subgraph Searching
  Q[User Query] -->|Embedding Model| V[Query Vector]
  V -->|Cosine Similarity| D
  Q -->|BM25| E
+ G -->|Branch Filter| F
  D --> F[Hybrid Fusion]
  E --> F
  F --> R[Ranked Results]
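
Reviewer note: the updated diagram feeds three inputs into the fusion step: vector hits, BM25 hits, and the branch filter. The README does not say which fusion formula the plugin uses, so the sketch below swaps in reciprocal rank fusion (RRF) purely as an illustration. The function names, the constant `k=60`, and the chunk ids are assumptions, not the plugin's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into a single ranking.

    Each list contributes 1 / (k + rank) per chunk, so a chunk that ranks
    well in both the vector list and the BM25 list rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by chunk id for determinism.
    return sorted(scores, key=lambda chunk_id: (-scores[chunk_id], chunk_id))


def hybrid_search(vector_hits, bm25_hits, branch_chunks, limit=10):
    """Fuse both result lists, then keep only chunks on the current branch.

    `branch_chunks` is a hypothetical set of chunk ids taken from the
    branch catalog for the checked-out branch.
    """
    fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
    return [chunk_id for chunk_id in fused if chunk_id in branch_chunks][:limit]
```

Filtering after fusion keeps the ranking logic independent of git state; a real implementation could equally filter each list before fusing.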
@@ -114,14 +119,52 @@ graph TD
  1. **Parsing**: We use `tree-sitter` to intelligently parse your code into meaningful blocks (functions, classes, interfaces). JSDoc comments and docstrings are automatically included with their associated code.
  2. **Chunking**: Large blocks are split with overlapping windows to preserve context across chunk boundaries.
  3. **Embedding**: These blocks are converted into vector representations using your configured AI provider.
- 4. **Storage**: Vectors are stored in a high-performance local index using `usearch` with F16 quantization for 50% memory savings.
- 5. **Hybrid Search**: Combines semantic similarity (vectors) with BM25 keyword matching for best results.
+ 4. **Storage**: Embeddings are stored in SQLite (deduplicated by content hash) and vectors in `usearch` with F16 quantization for 50% memory savings. A branch catalog tracks which chunks exist on each branch.
+ 5. **Hybrid Search**: Combines semantic similarity (vectors) with BM25 keyword matching, filtered by current branch.
 
  **Performance characteristics:**
  - **Incremental indexing**: ~50ms check time — only re-embeds changed files
  - **Smart chunking**: Understands code structure to keep functions whole, with overlap for context
  - **Native speed**: Core logic written in Rust for maximum performance
  - **Memory efficient**: F16 vector quantization reduces index size by 50%
+ - **Branch-aware**: Automatically tracks which chunks exist on each git branch
+
+ ## 🌿 Branch-Aware Indexing
+
+ The plugin automatically detects git branches and optimizes indexing across branch switches.
+
+ ### How It Works
+
+ When you switch branches, code changes but embeddings for unchanged content remain the same. The plugin:
+
+ 1. **Stores embeddings by content hash**: Embeddings are deduplicated across branches
+ 2. **Tracks branch membership**: A lightweight catalog tracks which chunks exist on each branch
+ 3. **Filters search results**: Queries only return results relevant to the current branch
+
+ ### Benefits
+
+ | Scenario | Without Branch Awareness | With Branch Awareness |
+ |----------|-------------------------|----------------------|
+ | Switch to feature branch | Re-index everything | Instant — reuse existing embeddings |
+ | Return to main | Re-index everything | Instant — catalog already exists |
+ | Search on branch | May return stale results | Only returns current branch's code |
+
+ ### Automatic Behavior
+
+ - **Branch detection**: Automatically reads from `.git/HEAD`
+ - **Re-indexing on switch**: Triggers when you switch branches (via file watcher)
+ - **Legacy migration**: Automatically migrates old indexes on first run
+ - **Garbage collection**: Health check removes orphaned embeddings and chunks
+
+ ### Storage Structure
+
+ ```
+ .opencode/index/
+ ├── codebase.db # SQLite: embeddings, chunks, branch catalog
+ ├── vectors.usearch # Vector index (uSearch)
+ ├── inverted-index.json # BM25 keyword index
+ └── file-hashes.json # File change detection
+ ```
 
  ## 🧰 Tools Available
 
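
Reviewer note: the new "Branch-Aware Indexing" section describes embeddings keyed by content hash plus a per-branch catalog. The sketch below shows one way that combination could avoid re-embedding on a branch switch. It is a minimal illustration using Python's built-in `sqlite3`; the table names, column names, and the `embed` callback are hypothetical, and SHA-256 stands in for the plugin's xxhash.

```python
import hashlib
import sqlite3


def content_hash(text: str) -> str:
    # SHA-256 is a stand-in here; the README says the plugin uses xxhash.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical two-table schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE embeddings (hash TEXT PRIMARY KEY, vector BLOB NOT NULL);
        CREATE TABLE branch_catalog (
            branch TEXT NOT NULL,
            hash   TEXT NOT NULL,
            PRIMARY KEY (branch, hash)
        );
        """
    )
    return db


def index_branch(db, branch, chunks, embed):
    """Record the chunks on `branch`; return how many needed a new embedding."""
    embedded = 0
    # Rebuild this branch's membership list from scratch.
    db.execute("DELETE FROM branch_catalog WHERE branch = ?", (branch,))
    for text in chunks:
        h = content_hash(text)
        known = db.execute("SELECT 1 FROM embeddings WHERE hash = ?", (h,)).fetchone()
        if known is None:
            # Only unseen content reaches the (costly) embedding call.
            db.execute("INSERT INTO embeddings (hash, vector) VALUES (?, ?)", (h, embed(text)))
            embedded += 1
        db.execute(
            "INSERT OR IGNORE INTO branch_catalog (branch, hash) VALUES (?, ?)",
            (branch, h),
        )
    db.commit()
    return embedded


def hashes_on_branch(db, branch):
    """Return the set of content hashes that belong to `branch`."""
    rows = db.execute("SELECT hash FROM branch_catalog WHERE branch = ?", (branch,))
    return {row[0] for row in rows}
```

Under these assumptions, indexing a feature branch that shares most chunks with `main` only embeds the chunks that differ, and `hashes_on_branch` supplies the set used to filter search results.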
@@ -151,7 +194,7 @@ Manually trigger indexing.
  Checks if the index is ready and healthy.
 
  ### `index_health_check`
- Maintenance tool to remove stale entries from deleted files.
+ Maintenance tool to remove stale entries from deleted files and orphaned embeddings/chunks from the database.
 
  ## 🎮 Slash Commands
 
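
Reviewer note: `index_health_check` now also removes orphaned embeddings. The sketch below is a guess at what "orphaned" could mean in practice: an embedding row whose content hash no longer appears in any branch's catalog. The schema and the function name are hypothetical and are not taken from the plugin's source.

```python
import sqlite3

# Hypothetical schema: embeddings keyed by content hash, plus a catalog of
# (branch, hash) pairs recording which chunks exist on which branch.
SCHEMA = """
CREATE TABLE embeddings (hash TEXT PRIMARY KEY, vector BLOB NOT NULL);
CREATE TABLE branch_catalog (
    branch TEXT NOT NULL,
    hash   TEXT NOT NULL,
    PRIMARY KEY (branch, hash)
);
"""


def remove_orphaned_embeddings(db: sqlite3.Connection) -> int:
    """Delete embeddings that no branch references; return the number removed."""
    cur = db.execute(
        "DELETE FROM embeddings WHERE hash NOT IN (SELECT hash FROM branch_catalog)"
    )
    db.commit()
    return cur.rowcount
```

Deleting by set difference keeps the cleanup to a single statement; embeddings still referenced by at least one branch are left untouched.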
@@ -263,12 +306,13 @@ CI will automatically run tests and type checking on your PR.
  │ ├── config/ # Configuration schema
  │ ├── embeddings/ # Provider detection and API calls
  │ ├── indexer/ # Core indexing logic + inverted index
+ │ ├── git/ # Git utilities (branch detection)
  │ ├── tools/ # OpenCode tool definitions
  │ ├── utils/ # File collection, cost estimation
  │ ├── native/ # Rust native module wrapper
- │ └── watcher/ # File change watcher
+ │ └── watcher/ # File/git change watcher
  ├── native/
- │ └── src/ # Rust: tree-sitter, usearch, xxhash
+ │ └── src/ # Rust: tree-sitter, usearch, xxhash, SQLite
  ├── tests/ # Unit tests (vitest)
  ├── commands/ # Slash command definitions
  ├── skill/ # Agent skill guidance
@@ -280,6 +324,7 @@ CI will automatically run tests and type checking on your PR.
  The Rust native module handles performance-critical operations:
  - **tree-sitter**: Language-aware code parsing with JSDoc/docstring extraction
  - **usearch**: High-performance vector similarity search with F16 quantization
+ - **SQLite**: Persistent storage for embeddings, chunks, and branch catalog
  - **BM25 inverted index**: Fast keyword search for hybrid retrieval
  - **xxhash**: Fast content hashing for change detection
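
Reviewer note: the README pairs content hashing with a `file-hashes.json` file for change detection. The sketch below illustrates the general idea of comparing stored hashes against fresh ones to decide which files need re-embedding. The function name and the in-memory dictionaries are assumptions, and SHA-256 stands in for xxhash so that the example needs only the standard library.

```python
import hashlib


def detect_changes(previous, files):
    """Compare stored file hashes against the current file contents.

    previous: mapping of path -> hash from the last indexing run
    files:    mapping of path -> current file bytes
    Returns (current_hashes, changed_paths, deleted_paths).
    """
    # SHA-256 is a stand-in; the plugin's native module uses xxhash.
    current = {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}
    # New files and files whose hash differs both need re-indexing.
    changed = sorted(path for path, h in current.items() if previous.get(path) != h)
    # Paths that vanished should have their index entries dropped.
    deleted = sorted(path for path in previous if path not in current)
    return current, changed, deleted
```

Persisting `current_hashes` after each run lets the next run skip every file whose hash is unchanged.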