@iceinvein/code-intelligence-mcp 1.3.6 → 1.3.7
- package/README.md +52 -6
- package/package.json +1 -1
package/README.md
CHANGED
@@ -8,16 +8,16 @@

 ---

-This server indexes your codebase locally to provide **fast, semantic, and structure-aware** code navigation to tools like OpenCode, Trae, and Cursor.
+This server indexes your codebase locally to provide **fast, semantic, and structure-aware** code navigation to tools like Claude Code, OpenCode, Trae, and Cursor.

 ## Why Use This Server?

 Unlike basic text search, this server builds a local knowledge graph to understand your code.

-* **Advanced Hybrid Search**: Combines
+* **Advanced Hybrid Search**: Combines keyword search ([BM25](#glossary) via Tantivy) with semantic vector search (via LanceDB + jina-code-embeddings-0.5b) using [Reciprocal Rank Fusion (RRF)](#glossary) — a technique that merges ranked results from different search systems by position rather than raw score.
 * **Smart Context Assembly**: Token-aware budgeting with query-aware truncation that keeps relevant lines within context limits.
-* **On-Device LLM Descriptions**: Automatically generates natural-language descriptions for every symbol using a local **Qwen2.5-Coder-1.5B** model (llama.cpp with Metal GPU), enriching search with human-readable summaries.
-* **PageRank Scoring**: Graph-based symbol importance scoring that identifies central, heavily-used components.
+* **On-Device LLM Descriptions**: Automatically generates natural-language descriptions for every symbol using a local **Qwen2.5-Coder-1.5B** model (llama.cpp with Metal GPU), enriching search with human-readable summaries. This bridges the vocabulary gap between how developers search ("auth handler") and how code is named (`authenticate_request`).
+* **PageRank Scoring**: Graph-based symbol importance scoring (similar to Google's original algorithm) that identifies central, heavily-used components by analyzing call graphs and type relationships.
 * **Learns from Feedback**: Optional learning system that adapts to user selections over time.
 * **Production First**: Multi-layer test detection (file paths, symbol names, and AST-level `#[test]`/`mod tests` analysis) ensures implementation code ranks above test helpers.
 * **Multi-Repo Support**: Index and search across multiple repositories/monorepos simultaneously.
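To make the PageRank Scoring bullet above concrete, here is a minimal power-iteration sketch in Python. It is an illustration only, not the server's implementation: the call graph, the symbol names other than `authenticate_request`, the damping factor of 0.85, and the iteration count are all assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph.

    `edges` maps each symbol to the list of symbols it calls or references.
    The damping factor and iteration count are illustrative defaults.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by symbols with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not edges.get(node))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for source, targets in edges.items():
            if not targets:
                continue
            # Each symbol passes an equal share of its rank to what it calls.
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical call graph: three handlers all call authenticate_request.
call_graph = {
    "login_handler": ["authenticate_request", "render_page"],
    "api_handler": ["authenticate_request"],
    "admin_handler": ["authenticate_request"],
    "authenticate_request": ["hash_token"],
}
scores = pagerank(call_graph)
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.3f}")
```

In this toy graph, symbols that several others call end up with more rank than the handlers, which nothing calls. That is the sense in which the score identifies central components.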
@@ -30,6 +30,30 @@ Unlike basic text search, this server builds a local knowledge graph to understa

 Runs directly via `npx` without requiring a local Rust toolchain.

+### Claude Code
+
+Add to your MCP settings (global `~/.claude.json` or project-level `.mcp.json`):
+
+```json
+{
+  "mcpServers": {
+    "code-intelligence": {
+      "command": "npx",
+      "args": ["-y", "@iceinvein/code-intelligence-mcp"],
+      "env": {}
+    }
+  }
+}
+```
+
+Or install via the CLI:
+
+```bash
+claude mcp add code-intelligence -- npx -y @iceinvein/code-intelligence-mcp
+```
+
+Once connected, Claude Code gains 23 MCP tools for semantic search (`search_code`), symbol navigation (`get_definition`, `find_references`), call/type graphs (`get_call_hierarchy`, `get_type_graph`), impact analysis (`find_affected_code`, `trace_data_flow`), and more. The server auto-detects the working directory and begins indexing in the background.
+
 ### OpenCode / Trae

 Add to your `opencode.json` (or global config):
@@ -77,7 +101,7 @@ CIMCP_MODE=standalone ./target/release/code-intelligence-mcp-server

 Point your MCP clients to the standalone server using Streamable HTTP transport:

-**Claude Code** (`~/.claude
+**Claude Code** (`~/.claude.json` or project-level `.mcp.json`):
 ```json
 {
   "mcpServers": {
@@ -89,6 +113,11 @@ Point your MCP clients to the standalone server using Streamable HTTP transport:
 }
 ```

+Or via the CLI:
+```bash
+claude mcp add --transport http code-intelligence http://localhost:3333/mcp
+```
+
 **OpenCode** (`opencode.json`):
 ```json
 {
@@ -258,7 +287,7 @@ The server supports semantic navigation and symbol extraction for the following

 ## Smart Ranking & Context Enhancement

-The ranking engine
+The search pipeline runs two parallel searches — keyword (BM25 via Tantivy) and semantic (vector embeddings via LanceDB) — then merges them using Reciprocal Rank Fusion (RRF). On top of this hybrid base, the ranking engine applies structural signals to optimize for relevance:

 1. **PageRank Symbol Importance**: Graph-based scoring that identifies central, heavily-used components (similar to Google's PageRank).
 2. **Reciprocal Rank Fusion (RRF)**: Combines keyword, vector, and graph search results using statistically optimal rank fusion.
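The rank-based merging described in this hunk can be sketched as follows. This is a minimal Python illustration of the standard RRF formula, where each list contributes `1 / (k + rank)` per result. It is not code from this package: the constant `k = 60` is the value commonly used in the literature and may differ from the server's setting, and the result lists are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge ranked lists by position rather than by raw score.

    `rankings` is a list of result lists, each ordered best-first.
    `k` dampens the influence of top ranks; 60 is an assumed default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + position)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical results from the two parallel searches.
keyword_hits = ["authenticate_request", "auth_error", "parse_token"]
vector_hits = ["verify_session", "parse_token", "authenticate_request"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
for name, score in fused:
    print(f"{name}: {score:.5f}")
```

Because only positions are used, BM25 scores and vector similarities never have to be put on a common scale. A symbol that appears in both lists, such as `authenticate_request` here, accumulates two contributions and rises above symbols found by only one search.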
@@ -291,6 +320,23 @@ For a deep dive into the system's design, see [System Architecture](SYSTEM_ARCHI

 ---

+## Glossary
+
+Key terms used throughout this documentation:
+
+| Term | Full Name | What It Means |
+|------|-----------|---------------|
+| **MCP** | Model Context Protocol | An open protocol for connecting LLM-based tools (like Claude Code, Cursor, OpenCode) to external data sources and capabilities. This server implements MCP to expose code search and navigation tools. |
+| **BM25** | Best Matching 25 | A probabilistic text search algorithm (used by Tantivy). Ranks results by how often your search terms appear in a document (term frequency) weighted by how rare those terms are across all documents (inverse document frequency / IDF). The standard algorithm behind most full-text search engines. |
+| **IDF** | Inverse Document Frequency | A component of BM25 that measures how rare a term is. A term like `authenticate` appearing in only 3 files has high IDF (very discriminating), while `error` appearing in 200 files has low IDF (less useful for ranking). |
+| **RRF** | Reciprocal Rank Fusion | A technique for merging ranked result lists from different search systems. Instead of comparing raw scores (which have different scales), RRF uses rank positions: a result ranked #1 in keyword search and #3 in vector search gets a combined score based on those positions. This makes it robust when combining fundamentally different search approaches. |
+| **GGUF** | GGML Unified Format | A binary format for storing quantized (compressed) neural network weights. Used by llama.cpp to run both the embedding model and the LLM efficiently on consumer hardware. Q4_K_M quantization reduces the 1.5B parameter model from ~3GB to ~1.1GB with minimal quality loss. |
+| **LLM** | Large Language Model | In this project, a local Qwen2.5-Coder-1.5B model that generates one-sentence natural-language descriptions for each code symbol (function, class, type). These descriptions are indexed alongside the code, helping BM25 match natural-language queries to technically-named code. |
+| **PageRank** | — | A graph algorithm (originally from Google Search) adapted here to score symbol importance. Symbols that are called/referenced by many other symbols get higher PageRank scores, indicating they are central to the codebase. |
+| **Tree-Sitter** | — | A parser generator that builds concrete syntax trees (CSTs) for source code. Used to extract symbols (functions, classes, types), their relationships (calls, imports, type hierarchies), and structural information from 8 supported languages. |
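The BM25 and IDF glossary entries can be made concrete with a small worked example. The sketch below uses a common BM25 formulation with the usual defaults `k1 = 1.2` and `b = 0.75`; whether Tantivy's constants match exactly is not stated in this README. The corpus size of 1000 files is hypothetical, chosen only to reuse the 3-file and 200-file figures from the IDF entry.

```python
import math


def idf(total_docs, docs_with_term):
    """Inverse document frequency: rarer terms score higher."""
    return math.log(1.0 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


def bm25_term_score(term_freq, doc_len, avg_doc_len, total_docs, docs_with_term,
                    k1=1.2, b=0.75):
    """Score one query term against one document.

    k1 controls term-frequency saturation and b controls length
    normalization; both values are assumed defaults.
    """
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return idf(total_docs, docs_with_term) * term_freq * (k1 + 1.0) / (term_freq + norm)


TOTAL_FILES = 1000  # hypothetical corpus size

rare = idf(TOTAL_FILES, 3)      # a term like `authenticate`, found in 3 files
common = idf(TOTAL_FILES, 200)  # a term like `error`, found in 200 files
print(f"IDF(rare)={rare:.3f}  IDF(common)={common:.3f}")

# Term frequency helps, but with diminishing returns.
for tf in (1, 2, 10):
    score = bm25_term_score(tf, doc_len=100, avg_doc_len=100,
                            total_docs=TOTAL_FILES, docs_with_term=3)
    print(f"tf={tf}: {score:.3f}")
```

The rare term receives a much larger IDF than the common one, and repeating a term in a document raises its score by less each time. Together these are why a match on a distinctive identifier outweighs many matches on a generic word.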
+
+---
+
 ## Configuration (Optional)

 Works without configuration by default. You can customize behavior via environment variables: