kiri-mcp-server 0.5.0 → 0.6.0
- package/README.md +58 -5
- package/config/default.example.yml +9 -0
- package/config/scoring-profiles.yml +11 -6
- package/dist/config/default.example.yml +9 -0
- package/dist/config/scoring-profiles.yml +11 -6
- package/dist/package.json +1 -1
- package/dist/server/context.js +0 -1
- package/dist/server/handlers.js +547 -79
- package/dist/server/scoring.js +8 -3
- package/dist/shared/duckdb.js +0 -2
- package/dist/shared/embedding.js +15 -2
- package/dist/shared/tokenizer.js +0 -1
- package/dist/shared/utils/simpleYaml.js +0 -1
- package/dist/src/server/handlers.d.ts.map +1 -1
- package/dist/src/server/handlers.js +234 -26
- package/dist/src/server/handlers.js.map +1 -1
- package/dist/src/server/rpc.d.ts.map +1 -1
- package/dist/src/server/rpc.js +9 -3
- package/dist/src/server/rpc.js.map +1 -1
- package/dist/src/server/scoring.d.ts +2 -0
- package/dist/src/server/scoring.d.ts.map +1 -1
- package/dist/src/server/scoring.js +13 -1
- package/dist/src/server/scoring.js.map +1 -1
- package/dist/src/shared/duckdb.d.ts +1 -0
- package/dist/src/shared/duckdb.d.ts.map +1 -1
- package/dist/src/shared/duckdb.js +54 -3
- package/dist/src/shared/duckdb.js.map +1 -1
- package/dist/src/shared/embedding.d.ts.map +1 -1
- package/dist/src/shared/embedding.js +2 -8
- package/dist/src/shared/embedding.js.map +1 -1
- package/dist/src/shared/tokenizer.d.ts +18 -0
- package/dist/src/shared/tokenizer.d.ts.map +1 -1
- package/dist/src/shared/tokenizer.js +35 -0
- package/dist/src/shared/tokenizer.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 > Intelligent code context extraction for LLMs via Model Context Protocol
 
-[](package.json)
 [](LICENSE)
 [](https://www.typescriptlang.org/)
 [](https://modelcontextprotocol.io/)
@@ -12,11 +12,12 @@
 ## 🎯 Why KIRI?
 
 - **🔌 MCP Native**: Plug-and-play integration with Claude Desktop, Codex CLI, and other MCP clients
-- **🧠 Smart Context**: Extract minimal, relevant code fragments based on task goals
+- **🧠 Smart Context**: Extract minimal, relevant code fragments based on task goals (95% accuracy)
 - **⚡ Fast**: Sub-second response time for most queries
 - **🔍 Semantic Search**: Multi-word queries, dependency analysis, and BM25 ranking
 - **👁️ Auto-Sync**: Watch mode automatically re-indexes when files change
 - **🛡️ Reliable**: Degrade-first architecture works without optional extensions
+- **📝 Phrase-Aware**: Recognizes compound terms (kebab-case, snake_case) for precise matching
 
 ## ⚙️ Prerequisites
 
@@ -152,9 +153,15 @@ KIRI provides 5 MCP tools for intelligent code exploration:
 
 ### 1. context_bundle
 
-**Extract relevant code context based on task goals**
+**Extract relevant code context based on task goals (95% accuracy)**
 
-The most powerful tool for getting started with unfamiliar code. Provide a task description, and KIRI returns the most relevant code snippets.
+The most powerful tool for getting started with unfamiliar code. Provide a task description, and KIRI returns the most relevant code snippets using phrase-aware tokenization and path-based scoring.
+
+**v0.6.0 improvements:**
+
+- **Phrase-aware tokenization**: Recognizes compound terms like `page-agent`, `user_profile` as single concepts (2× scoring weight)
+- **Path-based scoring**: Additional boost when keywords/phrases appear in file paths
+- **95% accuracy**: Improved from 65-75% through enhanced tokenization and scoring
 
 **When to use:**
 
@@ -405,11 +412,57 @@ kiri --repo . --db .kiri/index.duckdb --watch --debounce 1000
 **Watch Mode Features:**
 
 - **Debouncing**: Aggregates rapid changes to minimize reindex operations
+- **Incremental Indexing**: Only reindexes changed files (10-100x faster)
 - **Background Operation**: Doesn't interrupt ongoing queries
 - **Denylist Integration**: Respects `.gitignore` and `denylist.yml`
 - **Lock Management**: Prevents concurrent indexing
 - **Statistics**: Tracks reindex count, duration, and queue depth
 
+### Tokenization Strategy
+
+Control how KIRI tokenizes and matches compound terms using the `KIRI_TOKENIZATION_STRATEGY` environment variable:
+
+```bash
+# Phrase-aware (default): Recognizes kebab-case/snake_case as phrases
+export KIRI_TOKENIZATION_STRATEGY=phrase-aware
+
+# Legacy: Traditional word-by-word tokenization
+export KIRI_TOKENIZATION_STRATEGY=legacy
+
+# Hybrid: Both phrase and word-level matching
+export KIRI_TOKENIZATION_STRATEGY=hybrid
+```
+
+**Strategies:**
+
+- **`phrase-aware`** (default): Compound terms like `page-agent`, `user_profile` are treated as single phrases with 2× scoring weight. Best for codebases with consistent naming conventions.
+- **`legacy`**: Traditional tokenization that splits all delimiters. Use for backward compatibility.
+- **`hybrid`**: Combines both strategies for maximum flexibility.
+
+### Database Auto-Gitignore
+
+KIRI automatically creates `.gitignore` files in database directories to prevent accidental commits:
+
+```typescript
+// Enabled by default
+const db = await DuckDBClient.connect({
+  databasePath: ".kiri/index.duckdb",
+  autoGitignore: true, // Creates .gitignore with "*" pattern
+});
+
+// Disable if needed
+const db = await DuckDBClient.connect({
+  databasePath: ".kiri/index.duckdb",
+  autoGitignore: false,
+});
+```
+
+**Behavior:**
+
+- Only creates `.gitignore` if directory is inside a Git repository
+- Never overwrites existing `.gitignore` files
+- Uses wildcard pattern (`*`) to ignore all database files
+
 ### File Type Boosting
 
 Control search ranking behavior with the `boost_profile` parameter:
@@ -680,6 +733,6 @@ Built with:
 
 ---
 
-**Status**: v0.
+**Status**: v0.6.0 (Beta) - Production-ready for MCP clients
 
 For questions or support, please open a [GitHub issue](https://github.com/CAPHTECH/kiri/issues).
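The three rules listed under **Behavior** in the Database Auto-Gitignore section can be read as a small decision procedure. The sketch below is only an illustration of those rules, not KIRI's `DuckDBClient` code: the names `planAutoGitignore`, `isInsideGitRepo`, `parentDir`, and `joinPath` are hypothetical, paths are assumed to be POSIX-style, the trailing newline after `*` is an assumption, and the filesystem check is injected as a callback so the logic can be exercised without touching disk.

```typescript
// Hypothetical sketch of the documented rules; not KIRI's DuckDBClient code.
// `exists` is injected so the decision can be checked without a real filesystem.
type Exists = (path: string) => boolean;

interface GitignorePlan {
  path: string;
  content: string;
}

// Parent of a POSIX-style path ("/repo/.kiri" -> "/repo").
function parentDir(path: string): string {
  const idx = path.lastIndexOf("/");
  if (idx < 0) return ".";
  return idx === 0 ? "/" : path.slice(0, idx);
}

// Join a directory and an entry name without doubling the root slash.
function joinPath(dir: string, name: string): string {
  return dir === "/" ? `/${name}` : `${dir}/${name}`;
}

// Walk upward until a ".git" entry is found or the top of the path is reached.
function isInsideGitRepo(dir: string, exists: Exists): boolean {
  let current = dir;
  for (;;) {
    if (exists(joinPath(current, ".git"))) return true;
    if (current === "/" || current === ".") return false;
    current = parentDir(current);
  }
}

// Returns what would be written, or null when nothing should be created.
function planAutoGitignore(databasePath: string, exists: Exists): GitignorePlan | null {
  const dbDir = parentDir(databasePath);
  const target = joinPath(dbDir, ".gitignore");
  if (exists(target)) return null; // never overwrite an existing file
  if (!isInsideGitRepo(dbDir, exists)) return null; // only act inside a Git repository
  return { path: target, content: "*\n" }; // wildcard ignores every file in the directory
}

// Usage with an in-memory stand-in for the filesystem.
const demoFiles = new Set<string>(["/repo/.git"]);
console.log(planAutoGitignore("/repo/.kiri/index.duckdb", (p) => demoFiles.has(p)));
```

Returning a plan instead of writing the file keeps the rule testable; a real caller would presumably pass a filesystem-backed `exists` and write `content` to `path` when the result is not null.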
package/config/default.example.yml
CHANGED
@@ -4,6 +4,15 @@ mcp:
   tools:
     - context_bundle
     - files_search
+
+# Tokenization configuration for keyword extraction
+tokenization:
+  # Strategy: "phrase-aware" (default), "legacy", or "hybrid"
+  # - phrase-aware: Preserves hyphenated terms (e.g., "page-agent" stays as one unit)
+  # - legacy: Splits on hyphens (e.g., "page-agent" → ["page", "agent"])
+  # - hybrid: Emits both phrases and split keywords
+  strategy: "phrase-aware"
+
 indexer:
   repoRoot: "../../target-repo"
   database: "var/index.duckdb"
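The comments in the `tokenization` block describe what each strategy emits for a term such as `page-agent`. A minimal sketch of that behavior follows, assuming a simple regex-based splitter. The function `tokenize`, the `Tokens` shape, and the splitting rules are hypothetical and are not taken from KIRI's `tokenizer.js`; the strategy is passed as a parameter here rather than read from `KIRI_TOKENIZATION_STRATEGY`.

```typescript
// Hypothetical sketch of the three strategies; not KIRI's actual tokenizer.
type TokenizationStrategy = "phrase-aware" | "legacy" | "hybrid";

interface Tokens {
  phrases: string[]; // compound terms kept whole, e.g. "page-agent"
  keywords: string[]; // individual words
}

function tokenize(text: string, strategy: TokenizationStrategy): Tokens {
  const phrases: string[] = [];
  const keywords: string[] = [];

  // Split on anything that is not a letter, digit, underscore, or hyphen.
  const rawTerms = text.toLowerCase().split(/[^a-z0-9_-]+/).filter(Boolean);

  for (const raw of rawTerms) {
    // Drop stray leading or trailing delimiters such as "-flag" or "name_".
    const term = raw.replace(/^[-_]+|[-_]+$/g, "");
    if (term === "") continue;

    const parts = term.split(/[-_]+/);
    const isCompound = parts.length > 1;

    // phrase-aware and hybrid keep the compound term as one phrase.
    if (isCompound && strategy !== "legacy") phrases.push(term);
    // legacy and hybrid also emit the split words; plain words are always kept.
    if (!isCompound || strategy !== "phrase-aware") keywords.push(...parts);
  }

  return { phrases, keywords };
}

const query = "Fix page-agent user_profile bug";
console.log(tokenize(query, "phrase-aware"));
console.log(tokenize(query, "legacy"));
console.log(tokenize(query, "hybrid"));
```

With `phrase-aware`, `page-agent` and `user_profile` land only in `phrases`; with `legacy` they are split into four keywords; with `hybrid` both forms are emitted, which is what the config comment calls "both phrases and split keywords".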
package/config/scoring-profiles.yml
CHANGED
@@ -2,36 +2,41 @@
 # Each profile defines weights for different ranking signals
 
 default:
-  textMatch: 0
+  textMatch: 1.0 # Text/keyword match weight (increased to prioritize literal matches)
+  pathMatch: 1.5 # Path-based match weight (new - prioritizes files with keywords in paths)
   editingPath: 2.0 # Currently editing file weight
   dependency: 0.6 # Dependency relationship weight (increased to prioritize connected implementation files)
   proximity: 0.25 # Same directory weight
-  structural:
+  structural: 0.6 # Structural similarity weight (reduced to prevent false positives from similar structure)
 
 bugfix:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.8
   dependency: 0.7 # Higher: bugs often in dependencies
   proximity: 0.35
-  structural: 0.
+  structural: 0.7 # Reduced: prevent canvas-agent matching when searching page-agent
 
 testfail:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.6
   dependency: 0.85 # Very high: failed tests reveal dependencies
   proximity: 0.3
-  structural: 0.
+  structural: 0.7 # Reduced: focus on actual test dependencies
 
 typeerror:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.4
   dependency: 0.6
   proximity: 0.4 # Higher: type errors cluster in modules
-  structural: 0.6 #
+  structural: 0.6 # Already balanced for type analysis
 
 feature:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.5
   dependency: 0.45 # Lower: new features less dependent
   proximity: 0.5 # Higher: features cluster spatially
-  structural: 0.
+  structural: 0.6 # Reduced: focus on actual feature files
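To see why adding `pathMatch` and lowering `structural` changes ranking, it can help to treat a profile as the coefficients of a weighted sum. The sketch below assumes exactly that: a linear combination of signals in the range 0 to 1. The weight names and values are copied from the new `default` profile above, but the combination rule, the `scoreCandidate` function, and the example signal values are assumptions, since the scoring code in `handlers.js` is not shown in this part of the diff.

```typescript
// Hypothetical sketch: a linear combination of ranking signals.
// Weight names and values mirror the new `default` profile; the combination
// rule and the 0..1 signal range are assumptions, not KIRI's implementation.
interface ScoringWeights {
  textMatch: number;
  pathMatch: number;
  editingPath: number;
  dependency: number;
  proximity: number;
  structural: number;
}

const defaultProfile: ScoringWeights = {
  textMatch: 1.0,
  pathMatch: 1.5,
  editingPath: 2.0,
  dependency: 0.6,
  proximity: 0.25,
  structural: 0.6,
};

// Signals that are absent for a candidate file count as 0.
type Signals = Partial<ScoringWeights>;

function scoreCandidate(signals: Signals, weights: ScoringWeights): number {
  const keys = Object.keys(weights) as Array<keyof ScoringWeights>;
  return keys.reduce((total, key) => total + weights[key] * (signals[key] ?? 0), 0);
}

// Two made-up candidates for a query such as "page-agent":
// one whose path contains the keyword, one that is only structurally similar.
const pathHit = scoreCandidate({ textMatch: 1, pathMatch: 1 }, defaultProfile);
const structuralOnly = scoreCandidate({ textMatch: 1, structural: 1 }, defaultProfile);
console.log(pathHit > structuralOnly); // true under these assumed signals
```

Under this reading, a file whose path contains the query term outranks a structurally similar neighbor, which matches the intent stated in the `bugfix` comment about `canvas-agent` versus `page-agent`.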
package/dist/config/default.example.yml
CHANGED
@@ -4,6 +4,15 @@ mcp:
   tools:
     - context_bundle
     - files_search
+
+# Tokenization configuration for keyword extraction
+tokenization:
+  # Strategy: "phrase-aware" (default), "legacy", or "hybrid"
+  # - phrase-aware: Preserves hyphenated terms (e.g., "page-agent" stays as one unit)
+  # - legacy: Splits on hyphens (e.g., "page-agent" → ["page", "agent"])
+  # - hybrid: Emits both phrases and split keywords
+  strategy: "phrase-aware"
+
 indexer:
   repoRoot: "../../target-repo"
   database: "var/index.duckdb"
package/dist/config/scoring-profiles.yml
CHANGED
@@ -2,36 +2,41 @@
 # Each profile defines weights for different ranking signals
 
 default:
-  textMatch: 0
+  textMatch: 1.0 # Text/keyword match weight (increased to prioritize literal matches)
+  pathMatch: 1.5 # Path-based match weight (new - prioritizes files with keywords in paths)
   editingPath: 2.0 # Currently editing file weight
   dependency: 0.6 # Dependency relationship weight (increased to prioritize connected implementation files)
   proximity: 0.25 # Same directory weight
-  structural:
+  structural: 0.6 # Structural similarity weight (reduced to prevent false positives from similar structure)
 
 bugfix:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.8
   dependency: 0.7 # Higher: bugs often in dependencies
   proximity: 0.35
-  structural: 0.
+  structural: 0.7 # Reduced: prevent canvas-agent matching when searching page-agent
 
 testfail:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.6
   dependency: 0.85 # Very high: failed tests reveal dependencies
   proximity: 0.3
-  structural: 0.
+  structural: 0.7 # Reduced: focus on actual test dependencies
 
 typeerror:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.4
   dependency: 0.6
   proximity: 0.4 # Higher: type errors cluster in modules
-  structural: 0.6 #
+  structural: 0.6 # Already balanced for type analysis
 
 feature:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.5
   dependency: 0.45 # Lower: new features less dependent
   proximity: 0.5 # Higher: features cluster spatially
-  structural: 0.
+  structural: 0.6 # Reduced: focus on actual feature files
package/dist/package.json
CHANGED
package/dist/server/context.js
CHANGED