@iceinvein/code-intelligence-mcp 1.4.0 → 1.5.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +172 -6
- package/package.json +2 -3
- package/bin/standalone.js +0 -26
package/README.md
CHANGED
@@ -22,6 +22,7 @@ Unlike basic text search, this server builds a local knowledge graph to understa
 * **Production First**: Multi-layer test detection (file paths, symbol names, and AST-level `#[test]`/`mod tests` analysis) ensures implementation code ranks above test helpers.
 * **Multi-Repo Support**: Index and search across multiple repositories/monorepos simultaneously.
 * **OS-Native File Watching**: Uses the `notify` crate with macOS FSEvents for instant re-indexing on file changes.
+* **Built-in Chat UI**: Optional ChatGPT-style web interface powered by a local **Qwen2.5-Coder-14B** model. Ask questions about your codebase in the browser with live tool-call visibility and streaming responses.
 * **Fast & Local**: Written in **Rust** with Metal GPU acceleration on Apple Silicon. Parallel indexing with persistent caching.

 ---
@@ -221,6 +222,156 @@ warm_ttl_seconds = 300 # How long idle repos stay in memory

 ---

+## Chat Mode (Experimental)
+
+Chat mode adds a **ChatGPT-style web UI** for asking questions about your codebase directly in the browser. It runs a local **Qwen2.5-Coder-14B** model with full Metal GPU acceleration and uses the same search and navigation tools that MCP clients get — meaning search quality improvements automatically benefit the chat experience.
+
+Chat mode requires standalone mode and Apple Silicon with at least 16GB of unified memory.
+
+### Quick Start
+
+```bash
+# Start standalone server with chat enabled
+npx @iceinvein/code-intelligence-mcp-standalone --chat
+
+# Or from source
+./target/release/code-intelligence-mcp-server --standalone --chat
+
+# Custom ports
+./target/release/code-intelligence-mcp-server --standalone --port 3333 --chat --chat-port 4000
+
+# Via environment variables
+CIMCP_MODE=standalone CIMCP_CHAT=true ./target/release/code-intelligence-mcp-server
+```
+
+Once started, open **http://127.0.0.1:3334** in your browser.
+
+On first launch, the 14B model (~9GB) is downloaded from HuggingFace and cached at `~/.code-intelligence/models/qwen2.5-coder-14b-gguf/`. The MCP server starts immediately — the model loads in the background and the chat UI becomes available once loading completes (typically 2-5 minutes on first run, seconds on subsequent launches).
+
+### How It Works
+
+```mermaid
+sequenceDiagram
+    participant Browser as Web UI
+    participant Chat as Chat Server (:3334)
+    participant Agent as Agent Loop
+    participant LLM as Qwen2.5-14B (Metal GPU)
+    participant Tools as MCP Tool Handlers
+
+    Browser->>Chat: POST /api/chat (messages + repo_path)
+    Chat-->>Browser: SSE stream opened
+
+    loop Up to 3 tool rounds
+        Agent->>LLM: Generate (full prompt)
+        LLM-->>Agent: Response with <tool_call> blocks
+        Agent-->>Browser: SSE: tool_call (tool name + args)
+        Agent->>Tools: Execute tool (search_code, get_definition, etc.)
+        Tools-->>Agent: Tool results (JSON)
+        Agent-->>Browser: SSE: tool_result (summary)
+        Note over Agent: Append results to conversation, next round
+    end
+
+    Agent->>LLM: Generate stream (final response)
+    LLM-->>Agent: Tokens (one at a time)
+    Agent-->>Browser: SSE: token (streamed)
+    Agent-->>Browser: SSE: done
+```
+
+The agent uses up to **3 rounds** of tool calling before producing a final streamed response. Each round, the LLM can invoke any combination of 10 code intelligence tools to gather context before answering.
+
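The round structure described above can be sketched in plain JavaScript. This is an illustrative model only, not the server's implementation (the real loop is Rust, in `src/chat/agent.rs`); `runLLM`, `tools`, and `emit` are hypothetical stand-ins supplied by the caller, and the loop is written synchronously for brevity.

```javascript
// Illustrative sketch of the multi-round agent loop (not the real Rust code).
// runLLM, tools, and emit are hypothetical stand-ins supplied by the caller.
const MAX_TOOL_ROUNDS = 3;

// Extract <tool_call>{...}</tool_call> blocks from a model response.
function parseToolCalls(text) {
  const calls = [];
  const re = /<tool_call>([\s\S]*?)<\/tool_call>/g;
  let match;
  while ((match = re.exec(text)) !== null) calls.push(JSON.parse(match[1]));
  return calls;
}

function agentLoop(messages, runLLM, tools, emit) {
  for (let round = 0; round < MAX_TOOL_ROUNDS; round++) {
    const reply = runLLM(messages);
    const calls = parseToolCalls(reply);
    if (calls.length === 0) break; // model asked for no tools: context is ready

    for (const call of calls) {
      emit({ type: 'tool_call', tool: call.name, args: call.args });
      const result = tools[call.name](call.args);
      emit({ type: 'tool_result', tool: call.name, summary: result });
      // Append the result so the next round (or the final answer) can use it.
      messages.push({ role: 'tool', content: JSON.stringify({ tool: call.name, result }) });
    }
  }

  // Final pass: a single emit here stands in for token-by-token streaming.
  emit({ type: 'token', content: runLLM(messages) });
  emit({ type: 'done' });
}
```

With a stubbed model and one stubbed tool, a run produces the same event order the sequence diagram shows: `tool_call`, `tool_result`, then streamed tokens and `done`.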
+### Available Tools
+
+The chat agent has access to a curated subset of the full MCP tool suite:
+
+| Tool | Purpose |
+| :--- | :------ |
+| `search_code` | Hybrid semantic + keyword search |
+| `get_definition` | Jump to symbol source code |
+| `find_references` | Find all usages of a symbol |
+| `get_call_hierarchy` | Navigate callers and callees |
+| `get_type_graph` | Explore type inheritance |
+| `explore_dependency_graph` | Trace module imports/exports |
+| `get_file_symbols` | List all symbols in a file |
+| `find_affected_code` | Impact analysis (reverse dependencies) |
+| `trace_data_flow` | Follow variable reads and writes |
+| `summarize_file` | Structural file overview |
+
+### Web UI Features
+
+- **Live token streaming** — responses appear word by word as the model generates
+- **Tool call visibility** — see which tools the model invokes and their results in real time
+- **Multi-turn conversation** — full chat history maintained across turns
+- **Markdown rendering** — code blocks with syntax highlighting (via highlight.js)
+- **Dark/light theme** — toggle between themes with the header button
+- **Repo selector** — specify the repository path to query against
+- **Keyboard shortcuts** — Enter to send, Shift+Enter for newline
+
+### Configuration
+
+| Setting | CLI Flag | Env Var | Default | Description |
+| :------ | :------- | :------ | :------ | :---------- |
+| Enable chat | `--chat` | `CIMCP_CHAT=true` | off | Activate chat mode |
+| Chat port | `--chat-port PORT` | `CIMCP_CHAT_PORT=PORT` | `3334` | HTTP port for the chat UI |
+
+**Priority:** CLI flags > Environment variables > Defaults
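That precedence can be pictured as a small resolver. `resolveChatConfig` and its return shape are invented for this sketch and are not part of the package; only the flag names, variable names, and the `3334` default come from the table above.

```javascript
// Sketch of the CLI > env > default precedence for the chat settings above.
// resolveChatConfig is a hypothetical helper, not part of the package.
function resolveChatConfig(argv, env) {
  const portFlag = argv.indexOf('--chat-port');
  return {
    chat: argv.includes('--chat') || env.CIMCP_CHAT === 'true',
    chatPort:
      (portFlag !== -1 && Number(argv[portFlag + 1])) || // 1. CLI flag wins
      (env.CIMCP_CHAT_PORT && Number(env.CIMCP_CHAT_PORT)) || // 2. then the env var
      3334, // 3. then the default
  };
}
```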
+### API Reference
+
+The chat server exposes three HTTP endpoints:
+
+**`GET /`** — Serves the web UI (single-page HTML with embedded CSS/JS).
+
+**`GET /api/status`** — Returns model loading status.
+```json
+{"model_loaded": true, "model_name": "Qwen2.5-Coder-14B-Instruct"}
+```
+
+**`POST /api/chat`** — Starts a streaming chat session. Returns an SSE event stream.
+
+Request body:
+```json
+{
+  "messages": [
+    {"role": "user", "content": "How does the ranking system work?"}
+  ],
+  "repo_path": "/absolute/path/to/your/repo"
+}
+```
+
+SSE event types:
+
+| Event | Data | Description |
+| :---- | :--- | :---------- |
+| `token` | `{"type":"token","content":"The "}` | A generated text token |
+| `tool_call` | `{"type":"tool_call","tool":"search_code","args":{...}}` | Tool invocation started |
+| `tool_result` | `{"type":"tool_result","tool":"search_code","summary":"..."}` | Tool execution completed |
+| `error` | `{"type":"error","message":"..."}` | Non-recoverable error |
+| `done` | `{"type":"done"}` | Stream complete |
+
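Assuming each event arrives as a standard `data: <json>` SSE line (the wire framing itself is not documented here, so treat that as an assumption), a client can decode the stream with a few lines of JavaScript:

```javascript
// Decode the SSE payloads from the table above out of a raw response body.
// Assumes standard "data: <json>" framing; adjust if the wire format differs.
function parseSseEvents(raw) {
  return raw
    .split('\n')
    .filter((line) => line.startsWith('data: '))
    .map((line) => JSON.parse(line.slice('data: '.length)));
}
```

In the browser the same parsing is usually driven incrementally from a `fetch` body reader rather than from one complete string.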
+### Model Details
+
+| Property | Value |
+| :------- | :---- |
+| Model | Qwen2.5-Coder-14B-Instruct |
+| Format | GGUF Q4_K_M (~9 GB) |
+| Context window | 8,192 tokens |
+| Max generation | 2,048 tokens per response |
+| GPU offloading | All layers via Metal |
+| Sampling | Temperature 0.7 |
+| HuggingFace repo | `Qwen/Qwen2.5-Coder-14B-Instruct-GGUF` |
+| Cache location | `~/.code-intelligence/models/qwen2.5-coder-14b-gguf/` |
+
+### Limitations
+
+- **Standalone-only** — chat is not available in embedded (stdio) mode since it requires a persistent HTTP server
+- **Apple Silicon required** — the 14B model needs Metal GPU acceleration; 16GB+ unified memory recommended
+- **Context budget** — the 8K-token context window is shared between conversation history, tool definitions, and tool results; long conversations may lose early context
+- **Tool result truncation** — individual tool results are capped at 4,000 characters to preserve context budget
+- **No authentication** — the chat server binds to localhost only; do not expose it to the network without adding an auth layer
+- **Single-threaded generation** — one chat request is processed at a time; concurrent requests queue
+
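The tool-result cap can be pictured with a small helper. `truncateToolResult` and its `[truncated]` marker are invented for illustration; only the 4,000-character figure comes from the list above.

```javascript
// Illustration of the tool-result cap described above (the helper and the
// "[truncated]" marker are hypothetical; only the 4,000-char limit is documented).
const TOOL_RESULT_LIMIT = 4000;

function truncateToolResult(text) {
  if (text.length <= TOOL_RESULT_LIMIT) return text;
  return text.slice(0, TOOL_RESULT_LIMIT) + '\n[truncated]';
}
```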
+---
+
 ## Capabilities

 Available tools for the agent (23 tools total):
@@ -423,11 +574,19 @@ Works without configuration by default. You can customize behavior via environme
 ```mermaid
 flowchart LR
     Client[MCP Client] <==> Tools
+    Browser[Chat Web UI] <==> ChatServer

     subgraph Server [Code Intelligence Server]
         direction TB
         Tools[Tool Router]

+        subgraph Chat [Chat Mode]
+            direction TB
+            ChatServer[Axum HTTP + SSE] --> Agent[Agent Loop]
+            Agent --> ChatLLM["Qwen2.5-Coder-14B<br/>(Metal GPU)"]
+            Agent -- "tool calls" --> Handlers
+        end
+
         subgraph Indexer [Indexing Pipeline]
             direction TB
             Watch[OS-Native File Watcher] --> Scan[File Scan]

@@ -455,19 +614,20 @@ flowchart LR
         Context[Token-Aware Assembly]
         end

-
-        Tools
+        Handlers[Tool Handlers]
+        Tools --> Handlers
+        Handlers -- Index --> Watch
         PageRank --> SQLite
         Embed --> Lance
         Embed --> Cache
         LLMDesc --> SQLite
         JSDoc --> SQLite

-
+        Handlers -- Query --> QueryExpand
         QueryExpand --> Hybrid
         Hybrid --> Signals
         Signals --> Context
-        Context -->
+        Context --> Handlers
     end
 ```

@@ -492,6 +652,12 @@ EMBEDDINGS_BACKEND=hash BASE_DIR=/path/to/repo ./target/release/code-intelligenc

 ```text
 src/
+├── chat/                # Chat mode (--chat flag, standalone only)
+│   ├── mod.rs           # Axum HTTP server, SSE streaming, routes
+│   ├── agent.rs         # Multi-round agent loop, prompt building, tool call parsing
+│   ├── llm.rs           # ChatLlm (Qwen2.5-Coder-14B via llama.cpp, Metal GPU)
+│   ├── tools.rs         # Tool definitions (JSON) + dispatch to handlers
+│   └── ui.html          # Single-file web UI (vanilla JS, marked.js, highlight.js)
 ├── indexer/
 │   ├── extract/         # Language-specific symbol extractors (Rust, TS, Python, Go, Java, C, C++)
 │   ├── pipeline/        # Indexing pipeline stages (scan, parse, embed, watch, describe)

@@ -508,13 +674,13 @@ src/
 │   ├── hybrid.rs        # Hybrid BM25 + vector scoring loop
 │   └── postprocess.rs   # Final enforcement, vector promotion
 ├── graph/               # PageRank, call hierarchy, type graphs
-├── handlers/            # MCP tool handlers
+├── handlers/            # MCP tool handlers (shared by MCP server + chat agent)
 ├── server/              # MCP protocol routing (embedded + standalone)
 │   ├── mod.rs           # Shared tool dispatch, embedded handler
 │   └── standalone.rs    # Standalone HTTP handler with session routing
 ├── tools/               # Tool definitions (23 MCP tools)
 ├── embeddings/          # jina-code-0.5b embedding model (GGUF via llama.cpp)
-├── llm/                 # On-device LLM (Qwen2.5-Coder-1.5B via llama.cpp)
+├── llm/                 # On-device LLM (Qwen2.5-Coder-1.5B via llama.cpp, for descriptions)
 ├── reranker/            # Reranker trait and cache (currently disabled)
 ├── path/                # Cross-platform path normalization (camino)
 ├── text.rs              # Text processing (synonym expansion, morphological variants)
package/package.json
CHANGED

@@ -1,10 +1,9 @@
 {
   "name": "@iceinvein/code-intelligence-mcp",
-  "version": "1.4.0",
+  "version": "1.5.1",
   "description": "Code Intelligence MCP Server - Smart context for your LLM coding agent",
   "bin": {
-    "code-intelligence-mcp": "bin/run.js",
-    "code-intelligence-mcp-standalone": "bin/standalone.js"
+    "code-intelligence-mcp": "bin/run.js"
   },
   "scripts": {
     "postinstall": "node install.js"
package/bin/standalone.js
DELETED

@@ -1,26 +0,0 @@
-#!/usr/bin/env node
-
-const { spawn } = require('node:child_process');
-const path = require('node:path');
-const os = require('node:os');
-const fs = require('node:fs');
-
-const BINARY_NAME = 'code-intelligence-mcp-server';
-const BINARY_PATH = path.join(__dirname, BINARY_NAME);
-
-if (!fs.existsSync(BINARY_PATH)) {
-  console.error(`Binary not found at ${BINARY_PATH}`);
-  console.error('Please try reinstalling the package: npm install -g @iceinvein/code-intelligence-mcp');
-  process.exit(1);
-}
-
-// Pass through all args, prepend --standalone
-const args = ['--standalone', ...process.argv.slice(2)];
-
-const child = spawn(BINARY_PATH, args, {
-  stdio: 'inherit'
-});
-
-child.on('exit', (code) => process.exit(code));
-process.on('SIGINT', () => child.kill('SIGINT'));
-process.on('SIGTERM', () => child.kill('SIGTERM'));