opencode-semantic-search 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/AGENTS.md ADDED
@@ -0,0 +1,165 @@
1
+ # AGENTS.md — Guide for Coding Agents
2
+
3
+ This file provides conventions and commands for agents working in this repository.
4
+
5
+ ## Project Overview
6
+
7
+ A semantic search plugin for [OpenCode AI](https://github.com/opencode-ai/opencode). It indexes code files using tree-sitter chunking and embeddings (OpenAI / Ollama), stores vectors in SQLite (sqlite-vec), and exposes search tools via the plugin API.
8
+
9
+ **Runtime:** Bun (not Node.js)
10
+ **Language:** TypeScript, ESM (`"type": "module"`)
11
+ **Entry point:** `index.ts`
12
+
13
+ ## Commands
14
+
15
+ ```bash
16
+ # Type-check the entire project
17
+ bun run typecheck # alias: bun run check
18
+
19
+ # Run the plugin (typically via OpenCode host)
20
+ bun run index.ts
21
+
22
+ # Install dependencies
23
+ bun install
24
+ ```
25
+
26
+ - **No build step** — Bun runs TypeScript directly.
27
+ - **No test framework** is configured. If you add tests, prefer `bun test` (built-in runner) with `*.test.ts` files co-located in `src/`.
28
+ - **No linter/formatter** is configured. Follow the existing style (see below).
29
+
30
+ ## Project Structure
31
+
32
+ ```
33
+ index.ts # Plugin entrypoint — wires tools + events
34
+ src/
35
+ config.ts # Plugin config schema + loader
36
+ runtime.ts # Runtime setup helpers
37
+ types.d.ts # Ambient type declarations
38
+ embedder/ # Embedding providers (openai.ts, ollama.ts, interface.ts)
39
+ chunker/ # Code chunking (tree-sitter + line-based fallback)
40
+ search/ # Hybrid search (vector + BM25 ranking, context building)
41
+ store/ # SQLite wrapper, schema.sql, FTS integration
42
+ indexer/ # Incremental indexing, delta scanning, GC, resume
43
+ tools/ # Plugin tools: semantic_search, smart_grep, reindex, index_status
44
+ ```
45
+
46
+ ## Code Style
47
+
48
+ ### Formatting
49
+ - **Double quotes** for strings (`"foo"`, not `'foo'`)
50
+ - **Semicolons** at end of statements
51
+ - **No trailing commas** in function params (match existing code)
52
+ - ~100 char soft line limit; wrap long signatures naturally
53
+
54
+ ### Imports
55
+ - Use `import type { ... }` for type-only imports (enforced by `verbatimModuleSyntax` in tsconfig)
56
+ - Use `import { ... } from "..."` for value imports
57
+ - Bare specifiers for npm packages; relative paths for local modules
58
+ - No barrel files — import directly from the module that exports
59
+
60
+ ### Types
61
+ - `strict: true` is on — all code must type-check cleanly
62
+ - Prefer explicit return types on exported functions
63
+ - Use `interface` for object shapes, `type` for unions/intersections
64
+ - Avoid `any` — use `unknown` + narrowing, or specific types
65
+ - `noUncheckedIndexedAccess: true` — always guard indexed access
66
+
67
+ ### Naming
68
+ - **Files:** `snake_case.ts` (e.g., `semantic_search.ts`, `smart_grep.ts`)
69
+ - **Variables/functions:** `camelCase`
70
+ - **Types/interfaces/classes:** `PascalCase`
71
+ - **Constants:** `camelCase` (no SCREAMING_CASE unless truly global config)
72
+ - **SQL files:** `snake_case.sql`
73
+
74
+ ### Functions & Error Handling
75
+ - Prefer `async/await` over raw Promises
76
+ - Use early returns to reduce nesting
77
+ - Wrap fallible operations in try/catch; log errors with context
78
+ - Fallback-first pattern is common — degrade gracefully (e.g., tree-sitter → line chunking, semantic → ripgrep)
79
+
80
+ ### General Patterns
81
+ - Small, focused modules — one concern per file
82
+ - Factory functions for provider selection (see `createEmbedder` in `src/embedder/interface.ts`)
83
+ - Class-based wrappers for stateful resources (see `SqliteStore` in `src/store/sqlite.ts`)
84
+ - Config-driven behavior via `src/config.ts` defaults + deep merge
85
+ - Keep utility functions local to the file that uses them; don't create shared util files unless truly shared
86
+
87
+ ## TypeScript Config Highlights
88
+
89
+ Key `tsconfig.json` settings to be aware of:
90
+
91
+ | Setting | Value | Impact |
92
+ |---|---|---|
93
+ | `strict` | `true` | Full strict mode |
94
+ | `verbatimModuleSyntax` | `true` | Must use `import type` for type imports |
95
+ | `noUncheckedIndexedAccess` | `true` | Index signatures return `T | undefined` |
96
+ | `noFallthroughCasesInSwitch` | `true` | No silent switch fallthrough |
97
+ | `noImplicitOverride` | `true` | Must use `override` keyword |
98
+ | `moduleResolution` | `bundler` | Modern ESM resolution |
99
+
100
+ Run `bun run typecheck` before submitting changes to catch all of the above.
101
+
102
+ ## Adding a New Tool
103
+
104
+ 1. Create `src/tools/your_tool.ts` exporting a tool definition
105
+ 2. Register it in the tools map in `index.ts`
106
+ 3. Follow the pattern of existing tools (see `semantic_search.ts` for reference)
107
+
108
+ ## Dependencies
109
+
110
+ - `@opencode-ai/plugin` — Plugin API (types + runtime)
111
+ - `web-tree-sitter` + `tree-sitter-wasms` — Syntax-aware code chunking
112
+ - `sqlite-vec` — Vector similarity search in SQLite
113
+ - `picomatch` — Glob pattern matching
114
+
115
+ ## Common Pitfalls
116
+
117
+ - **`verbatimModuleSyntax`** — importing a type without `import type` will fail at runtime. Always check: is this value used at runtime, or only in type positions?
118
+ - **`noUncheckedIndexedAccess`** — `arr[0]` returns `T | undefined`. Always guard with `if` or use `?.` before accessing properties.
119
+ - **Bun, not Node** — don't use `node:` prefixed imports unless the module requires it. Prefer standard ESM imports.
120
+ - **No barrel re-exports** — each module exports its own API. Import from the specific file, not an `index.ts` barrel.
121
+ - **Config merging** — config uses deep merge. Partial configs are fine; missing keys fall back to defaults in `src/config.ts`.
122
+
123
+ ## Working with the Plugin API
124
+
125
+ - Tools are defined as objects with `name`, `description`, `parameters`, and `execute` — see `@opencode-ai/plugin` types.
126
+ - Event handlers (`onFileChanged`, etc.) are registered in `index.ts`.
127
+ - The plugin receives a context object with access to the host filesystem and shell — use these instead of raw `fs`/`child_process`.
128
+ - **TUI feedback:** `ToolContext.metadata()` updates inline tool-call cards during `execute`. For transient OpenCode TUI pop-ups, use the SDK HTTP client (`runtime.opencodeClient.tui.showToast` → POST `/tui/show-toast`); the plugin uses `src/tui_toast.ts` for throttled **progress** toasts during indexing (phase changes + ~2.2s cadence) and for **completion** toasts (session sync summary, reindex done).
129
+
130
+ ## SQL Conventions
131
+
132
+ - Schema lives in `src/store/schema.sql` — edit there, then the `SqliteStore` class applies it on init.
133
+ - Use parameterized queries (`db.query(sql).all(params)`) — never interpolate user input into SQL.
134
+ - FTS5 tables mirror the main tables; keep column names in sync.
135
+
136
+ ## Debugging Tips
137
+
138
+ - Run `bun run typecheck` first — catches most issues before runtime.
139
+ - The plugin logs to stderr via the host; add `console.error(...)` for quick debugging.
140
+ - SQLite DB is stored in the OpenCode data directory — inspect with `sqlite3` CLI if needed.
141
+ - For embedding issues, test the provider directly (e.g., curl the Ollama/OpenAI endpoint) before debugging the plugin layer.
142
+
143
+ ## Verification Checklist
144
+
145
+ Before submitting changes:
146
+
147
+ 1. `bun run typecheck` — must pass with zero errors
148
+ 2. Review `import type` usage — runtime imports must not be type-only
149
+ 3. Guard all indexed access (`arr[i]`, `obj[key]`) for `undefined`
150
+ 4. Confirm new files follow `snake_case.ts` naming
151
+ 5. If adding a tool, verify it's registered in `index.ts`
152
+
153
+ ## Learned User Preferences
154
+
155
+ - For codebase search, prefer the plugin’s semantic path for conceptual or multi-word queries; rely on the ripgrep fallback for exact symbols, tight literals, or when the index or embedder is unavailable.
156
+
157
+ ## Learned Workspace Facts
158
+
159
+ - OpenCode does not load plugins from `package.json` alone; a TypeScript shim under the project `.opencode/plugins/` directory or under `~/.config/opencode/plugins/` must import this plugin’s entry so the host merges its tools (see `README.md` / `SETUP.md`). End users can run `bunx opencode-semantic-search@latest` (after npm publish) or `bash install.sh` from a clone to generate that shim.
160
+ - Tools registered on the plugin object use the keys `semantic_search`, `grep` (smart grep implementation in `src/tools/smart_grep.ts`), `index_status`, `reindex`, and `diagnostic_bundle`. Slash aliases (`/sem-status`, `/sem-search`, etc.) are purely LLM-driven: users add Markdown stubs under `.opencode/commands/` whose bodies tell the assistant which tool to call (see SETUP.md). The `command.execute.before` hook is not used because it does not suppress the LLM invocation in the current plugin API version.
161
+ - Resolve bundled assets (e.g. tree-sitter WASM and packages under this plugin’s `node_modules`) from the plugin install directory (e.g. paths anchored with `import.meta.dir`), not `process.cwd()`, because the OpenCode host’s working directory is the user’s open project.
162
+ - Config loading merges in order: built-in defaults, then optional `~/.config/opencode/semantic-search.json`, then optional `<worktree>/.opencode/semantic-search.json` (project overrides global for overlapping keys).
163
+ - Restart OpenCode after changing merged `semantic-search.json` so the plugin reloads options (e.g. `indexing.embed_concurrency`, embedding settings); config is not hot-reloaded from disk while the host is running.
164
+ - The logger supports a `log_file` config option (`logging.log_file` in `semantic-search.json`); it defaults to `~/.cache/opencode/semantic-search/plugin.log` and appends newline-delimited JSON entries. Set it to `null` or `""` in a config override to disable file logging.
165
+ - The plugin uses `experimental.chat.system.transform` (fires on every LLM turn) to inject a live `## Semantic Code Search` block into the system prompt with index stats and usage guidance, in addition to `experimental.session.compacting` (fires only during context compaction). Both hooks live in `index.ts`.
package/README.md ADDED
@@ -0,0 +1,138 @@
1
+ # opencode-semantic-search
2
+
3
+ Local-first semantic search plugin for [OpenCode](https://opencode.ai), with smart `grep` routing and incremental indexing.
4
+
5
+ ## Quickstart
6
+
7
+ ### Prerequisites
8
+
9
+ - [Bun](https://bun.sh) 1.1+
10
+ - [OpenCode](https://opencode.ai)
11
+ - [ripgrep](https://github.com/BurntSushi/ripgrep) (`rg`)
12
+ - Either [Ollama](https://ollama.com) or an OpenAI-compatible embedding API
13
+
14
+ ### Install
15
+
16
+ **Option A — `bunx` (recommended after the package is on npm)**
17
+
18
+ Runs the same installer as `install.sh` from the published package (no clone):
19
+
20
+ ```bash
21
+ # Global shim (~/.config/opencode/plugins) — default
22
+ bunx opencode-semantic-search@latest
23
+
24
+ # Project-local only (./.opencode/plugins)
25
+ bunx opencode-semantic-search@latest install --local
26
+ ```
27
+
28
+ Other flags match `install.sh` (see below).
29
+
30
+ **Option B — Remote `install.sh` (curl)**
31
+
32
+ ```bash
33
+ curl -fsSL https://raw.githubusercontent.com/jainprashul/opencode-semantic-search/main/install.sh | bash
34
+ ```
35
+
36
+ Use `bash -s -- --local` after the pipe for a project-local shim instead of the default global install.
37
+
38
+ **Option C — Git clone**
39
+
40
+ ```bash
41
+ git clone https://github.com/jainprashul/opencode-semantic-search.git
42
+ cd opencode-semantic-search
43
+
44
+ # Global (default): ~/.config/opencode/plugins
45
+ bash install.sh
46
+
47
+ # Project-local only: ./.opencode/plugins
48
+ bash install.sh --local
49
+ ```
50
+
51
+ Common installer modes (clone or remote script):
52
+
53
+ ```bash
54
+ # explicit global install (default)
55
+ bash install.sh --global
56
+
57
+ # project-local only
58
+ bash install.sh --local
59
+
60
+ # OpenAI embeddings instead of Ollama
61
+ bash install.sh --openai-key-env OPENAI_API_KEY --skip-ollama
62
+
63
+ # custom Ollama model
64
+ bash install.sh --ollama-model nomic-embed-text
65
+
66
+ # skip Ollama checks/model pull
67
+ bash install.sh --skip-ollama
68
+ ```
69
+
70
+ The script installs dependencies, writes a plugin shim, optionally pulls the Ollama model, writes config, and runs integration self-tests when the test scripts are present (git clone); npm installs skip that step.
71
+
72
+ ### Start
73
+
74
+ ```bash
75
+ # Ollama users
76
+ ollama serve
77
+
78
+ # open your codebase
79
+ cd /path/to/your/repo
80
+ opencode
81
+ ```
82
+
83
+ ## Available tools
84
+
85
+ - `semantic_search(query, top_k?, threshold?, path?)`
86
+ - `grep(pattern|query, ...)` smart route: conceptual -> semantic, exact/regex -> `rg`
87
+ - `index_status()` health and coverage stats
88
+ - `reindex()` full index rebuild
89
+ - `diagnostic_bundle()` JSON support bundle (provider, index, routing history)
90
+
91
+ Optional **slash aliases** (`/sem-status`, `/sem-search`, etc.): add Markdown stubs under `.opencode/commands/` as described in [SETUP.md](SETUP.md#8-using-the-plugin-in-opencode). Each stub body is an LLM prompt that asks the assistant to call the matching tool.
92
+
93
+ ## Verify it works
94
+
95
+ From this plugin repo:
96
+
97
+ ```bash
98
+ bun run check
99
+ bun run test:integration
100
+ bun run diagnostic:bundle
101
+ ```
102
+
103
+ Expected:
104
+
105
+ - `bun run check` exits successfully (no TypeScript errors).
106
+ - `bun run test:integration` prints JSON with `"ok":true` from both suites.
107
+ - `bun run diagnostic:bundle` prints one JSON bundle with provider health, index stats, DB path, and recent routing outcomes.
108
+
109
+ In OpenCode (after startup indexing):
110
+
111
+ - `index_status()` should report `provider_healthy: true`, `files_indexed > 0`, `total_chunks > 0`.
112
+ - `semantic_search("authentication flow")` should return JSON results with file paths and scores.
113
+ - `grep("auth retry flow")` should return scored semantic matches (`score=...`) when provider/index are healthy.
114
+
115
+ ## Debugging pointers
116
+
117
+ - No semantic results: ensure embedder is reachable (`ollama serve` or valid OpenAI key) and run `reindex()`.
118
+ - `grep` behaving like plain text search: this is expected for exact/regex/single-token patterns.
119
+ - Wrong results after changing embedding model/dimensions: run `reindex()`.
120
+ - Persistent index issues: remove the project DB under `~/.cache/opencode/semantic-search/<project-hash>/embeddings.db`.
121
+
122
+ ## Publishing (maintainers)
123
+
124
+ Bump `version` in `package.json`, then:
125
+
126
+ ```bash
127
+ bun run typecheck
128
+ npm publish --access public
129
+ ```
130
+
131
+ `prepublishOnly` runs `typecheck` automatically. The npm package name is [`opencode-semantic-search`](https://www.npmjs.com/package/opencode-semantic-search).
132
+
133
+ ## Docs
134
+
135
+ - `docs/ARCHITECTURE.md` architecture, data flow, index lifecycle, smart grep routing
136
+ - `docs/CONFIG.md` full configuration reference + install script behavior
137
+ - `docs/DEBUGGING.md` logging/diagnostics, troubleshooting, and verification playbook
138
+ - `SETUP.md` extended setup walkthrough and notes