kiri-mcp-server 0.5.0 → 0.7.0
- package/README.md +59 -5
- package/config/default.example.yml +9 -0
- package/config/scoring-profiles.yml +21 -6
- package/dist/config/default.example.yml +9 -0
- package/dist/config/scoring-profiles.yml +21 -6
- package/dist/package.json +1 -1
- package/dist/server/context.js +0 -1
- package/dist/server/handlers.js +547 -79
- package/dist/server/scoring.js +8 -3
- package/dist/shared/duckdb.js +0 -2
- package/dist/shared/embedding.js +15 -2
- package/dist/shared/tokenizer.js +0 -1
- package/dist/shared/utils/simpleYaml.js +0 -1
- package/dist/src/server/handlers.d.ts.map +1 -1
- package/dist/src/server/handlers.js +353 -85
- package/dist/src/server/handlers.js.map +1 -1
- package/dist/src/server/rpc.d.ts.map +1 -1
- package/dist/src/server/rpc.js +9 -3
- package/dist/src/server/rpc.js.map +1 -1
- package/dist/src/server/scoring.d.ts +6 -0
- package/dist/src/server/scoring.d.ts.map +1 -1
- package/dist/src/server/scoring.js +29 -5
- package/dist/src/server/scoring.js.map +1 -1
- package/dist/src/shared/duckdb.d.ts +1 -0
- package/dist/src/shared/duckdb.d.ts.map +1 -1
- package/dist/src/shared/duckdb.js +54 -3
- package/dist/src/shared/duckdb.js.map +1 -1
- package/dist/src/shared/embedding.d.ts.map +1 -1
- package/dist/src/shared/embedding.js +2 -8
- package/dist/src/shared/embedding.js.map +1 -1
- package/dist/src/shared/tokenizer.d.ts +18 -0
- package/dist/src/shared/tokenizer.d.ts.map +1 -1
- package/dist/src/shared/tokenizer.js +35 -0
- package/dist/src/shared/tokenizer.js.map +1 -1
- package/package.json +1 -1
package/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 > Intelligent code context extraction for LLMs via Model Context Protocol
 
-[](package.json)
 [](LICENSE)
 [](https://www.typescriptlang.org/)
 [](https://modelcontextprotocol.io/)
@@ -12,11 +12,12 @@
 ## 🎯 Why KIRI?
 
 - **🔌 MCP Native**: Plug-and-play integration with Claude Desktop, Codex CLI, and other MCP clients
-- **🧠 Smart Context**: Extract minimal, relevant code fragments based on task goals
+- **🧠 Smart Context**: Extract minimal, relevant code fragments based on task goals (95% accuracy)
 - **⚡ Fast**: Sub-second response time for most queries
 - **🔍 Semantic Search**: Multi-word queries, dependency analysis, and BM25 ranking
 - **👁️ Auto-Sync**: Watch mode automatically re-indexes when files change
 - **🛡️ Reliable**: Degrade-first architecture works without optional extensions
+- **📝 Phrase-Aware**: Recognizes compound terms (kebab-case, snake_case) for precise matching
 
 ## ⚙️ Prerequisites
 
@@ -152,9 +153,16 @@ KIRI provides 5 MCP tools for intelligent code exploration:
 
 ### 1. context_bundle
 
-**Extract relevant code context based on task goals**
+**Extract relevant code context based on task goals (95% accuracy)**
 
-The most powerful tool for getting started with unfamiliar code. Provide a task description, and KIRI returns the most relevant code snippets.
+The most powerful tool for getting started with unfamiliar code. Provide a task description, and KIRI returns the most relevant code snippets using phrase-aware tokenization and path-based scoring.
+
+**v0.7.0 improvements:**
+
+- **Multiplicative penalties**: Documentation files now penalized by ×0.3 (70% reduction) instead of additive -2.0
+- **Implementation prioritization**: Implementation files rank 5-10× higher than documentation (1.82 vs 0.30)
+- **Unified boosting logic**: Consistent file ranking across `files_search` and `context_bundle`
+- **Configurable profiles**: `boost_profile` parameter supports "default" (implementation-first), "docs" (documentation-first), or "none" (natural BM25)
 
 **When to use:**
 
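Note on the `context_bundle` hunk: the README contrasts the old additive documentation penalty (-2.0) with the new multiplicative one (×0.3). The TypeScript sketch below illustrates the difference on a single score. The function names, the `FileKind` type, and the sample base score are hypothetical and are not taken from `handlers.js` or `scoring.js`; only the constants -2.0, 0.3, and 1.3 come from the diff.

```typescript
// Hypothetical sketch, not KIRI's actual scoring code.
type FileKind = "implementation" | "documentation";

// Older behaviour as the README describes it: subtract a fixed amount from docs.
function additiveScore(base: number, kind: FileKind): number {
  return kind === "documentation" ? base - 2.0 : base;
}

// Newer behaviour as the README describes it: scale the score by a multiplier.
function multiplicativeScore(
  base: number,
  kind: FileKind,
  docPenaltyMultiplier = 0.3, // value from scoring-profiles.yml
  implBoostMultiplier = 1.3, // value from scoring-profiles.yml
): number {
  const multiplier =
    kind === "documentation" ? docPenaltyMultiplier : implBoostMultiplier;
  return base * multiplier;
}

const sampleBase = 1.0; // assumed raw relevance score, for illustration only
console.log(additiveScore(sampleBase, "documentation"));
console.log(multiplicativeScore(sampleBase, "documentation"));
console.log(multiplicativeScore(sampleBase, "implementation"));
```

Under these assumptions, a fixed subtraction can push a weakly matching document below zero, and its relative effect shrinks as the base score grows. A multiplier always leaves a document with 30% of its score, so the ordering among documentation files is preserved.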
@@ -405,11 +413,57 @@ kiri --repo . --db .kiri/index.duckdb --watch --debounce 1000
 **Watch Mode Features:**
 
 - **Debouncing**: Aggregates rapid changes to minimize reindex operations
+- **Incremental Indexing**: Only reindexes changed files (10-100x faster)
 - **Background Operation**: Doesn't interrupt ongoing queries
 - **Denylist Integration**: Respects `.gitignore` and `denylist.yml`
 - **Lock Management**: Prevents concurrent indexing
 - **Statistics**: Tracks reindex count, duration, and queue depth
 
+### Tokenization Strategy
+
+Control how KIRI tokenizes and matches compound terms using the `KIRI_TOKENIZATION_STRATEGY` environment variable:
+
+```bash
+# Phrase-aware (default): Recognizes kebab-case/snake_case as phrases
+export KIRI_TOKENIZATION_STRATEGY=phrase-aware
+
+# Legacy: Traditional word-by-word tokenization
+export KIRI_TOKENIZATION_STRATEGY=legacy
+
+# Hybrid: Both phrase and word-level matching
+export KIRI_TOKENIZATION_STRATEGY=hybrid
+```
+
+**Strategies:**
+
+- **`phrase-aware`** (default): Compound terms like `page-agent`, `user_profile` are treated as single phrases with 2× scoring weight. Best for codebases with consistent naming conventions.
+- **`legacy`**: Traditional tokenization that splits all delimiters. Use for backward compatibility.
+- **`hybrid`**: Combines both strategies for maximum flexibility.
+
+### Database Auto-Gitignore
+
+KIRI automatically creates `.gitignore` files in database directories to prevent accidental commits:
+
+```typescript
+// Enabled by default
+const db = await DuckDBClient.connect({
+  databasePath: ".kiri/index.duckdb",
+  autoGitignore: true, // Creates .gitignore with "*" pattern
+});
+
+// Disable if needed
+const db = await DuckDBClient.connect({
+  databasePath: ".kiri/index.duckdb",
+  autoGitignore: false,
+});
+```
+
+**Behavior:**
+
+- Only creates `.gitignore` if directory is inside a Git repository
+- Never overwrites existing `.gitignore` files
+- Uses wildcard pattern (`*`) to ignore all database files
+
 ### File Type Boosting
 
 Control search ranking behavior with the `boost_profile` parameter:
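Note on the "Database Auto-Gitignore" section added in this hunk: the three documented behaviors (only inside a Git repository, never overwrite, wildcard pattern) can be sketched with Node's `fs` module as below. This is a minimal illustration, not the code in `duckdb.js`; `isInsideGitRepo` and `ensureDatabaseGitignore` are hypothetical names, and detecting a repository by walking up to a `.git` entry is an assumption about how the check might be done.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumption: a directory counts as "inside a Git repository" when it or
// any ancestor contains a `.git` entry.
function isInsideGitRepo(dir: string): boolean {
  let current = path.resolve(dir);
  while (true) {
    if (fs.existsSync(path.join(current, ".git"))) return true;
    const parent = path.dirname(current);
    if (parent === current) return false; // reached the filesystem root
    current = parent;
  }
}

// Returns true only when a new .gitignore was written.
function ensureDatabaseGitignore(databasePath: string): boolean {
  const dir = path.dirname(path.resolve(databasePath));
  if (!fs.existsSync(dir) || !isInsideGitRepo(dir)) return false;
  try {
    // The "wx" flag makes the write fail if the file already exists,
    // so an existing .gitignore is never overwritten.
    fs.writeFileSync(path.join(dir, ".gitignore"), "*\n", { flag: "wx" });
    return true;
  } catch (err) {
    if ((err as { code?: string }).code === "EEXIST") return false;
    throw err;
  }
}

// Demo in a throwaway directory made to look like a Git working tree.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "kiri-demo-"));
fs.mkdirSync(path.join(demoRoot, ".git"));
fs.mkdirSync(path.join(demoRoot, ".kiri"));
const demoDbPath = path.join(demoRoot, ".kiri", "index.duckdb");
const createdFirst = ensureDatabaseGitignore(demoDbPath);
const createdSecond = ensureDatabaseGitignore(demoDbPath);
console.log(createdFirst, createdSecond);
```

Using an exclusive-create write rather than an `existsSync` check followed by a write avoids a race between the check and the write.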
@@ -680,6 +734,6 @@ Built with:
 
 ---
 
-**Status**: v0.
+**Status**: v0.7.0 (Beta) - Production-ready for MCP clients
 
 For questions or support, please open a [GitHub issue](https://github.com/CAPHTECH/kiri/issues).
package/config/default.example.yml
CHANGED
@@ -4,6 +4,15 @@ mcp:
   tools:
     - context_bundle
     - files_search
+
+# Tokenization configuration for keyword extraction
+tokenization:
+  # Strategy: "phrase-aware" (default), "legacy", or "hybrid"
+  # - phrase-aware: Preserves hyphenated terms (e.g., "page-agent" stays as one unit)
+  # - legacy: Splits on hyphens (e.g., "page-agent" → ["page", "agent"])
+  # - hybrid: Emits both phrases and split keywords
+  strategy: "phrase-aware"
+
 indexer:
   repoRoot: "../../target-repo"
   database: "var/index.duckdb"
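Note on the `tokenization.strategy` setting: the config comments describe three behaviors for compound terms such as `page-agent`. The sketch below is one way those behaviors could look in TypeScript. It is an illustration only: `resolveStrategy` and `tokenize` are hypothetical names, the term regex is an assumption, and the 2× phrase weighting mentioned in the README is not modelled.

```typescript
// Hypothetical sketch of the three strategies; KIRI's real tokenizer may differ.
type TokenizationStrategy = "phrase-aware" | "legacy" | "hybrid";

// Falls back to the documented default when the value is missing or unknown.
function resolveStrategy(value: string | undefined): TokenizationStrategy {
  return value === "legacy" || value === "hybrid" ? value : "phrase-aware";
}

function tokenize(text: string, strategy: TokenizationStrategy): string[] {
  // A term is a run of letters/digits, optionally joined by "-" or "_".
  const terms = text.toLowerCase().match(/[a-z0-9]+(?:[-_][a-z0-9]+)*/g) ?? [];
  const tokens: string[] = [];
  for (const term of terms) {
    const parts = term.split(/[-_]/);
    // phrase-aware and hybrid keep the compound term as a single unit.
    if (strategy !== "legacy") tokens.push(term);
    // legacy always splits; hybrid adds the parts of compound terms as well.
    if (strategy === "legacy" || (strategy === "hybrid" && parts.length > 1)) {
      tokens.push(...parts);
    }
  }
  // Remove duplicates while keeping first-seen order.
  return tokens.filter((token, index) => tokens.indexOf(token) === index);
}

const query = "fix page-agent user_profile";
console.log(tokenize(query, "phrase-aware")); // ["fix", "page-agent", "user_profile"]
console.log(tokenize(query, "legacy")); // ["fix", "page", "agent", "user", "profile"]
console.log(tokenize(query, resolveStrategy("hybrid")));
```

With `hybrid`, the query yields both `page-agent` and its parts `page` and `agent`, which trades some precision for recall.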
package/config/scoring-profiles.yml
CHANGED
@@ -2,36 +2,51 @@
 # Each profile defines weights for different ranking signals
 
 default:
-  textMatch: 0
+  textMatch: 1.0 # Text/keyword match weight (increased to prioritize literal matches)
+  pathMatch: 1.5 # Path-based match weight (new - prioritizes files with keywords in paths)
   editingPath: 2.0 # Currently editing file weight
   dependency: 0.6 # Dependency relationship weight (increased to prioritize connected implementation files)
   proximity: 0.25 # Same directory weight
-  structural:
+  structural: 0.6 # Structural similarity weight (reduced to prevent false positives from similar structure)
+  docPenaltyMultiplier: 0.3 # Multiplicative penalty for docs (0.3 = 70% reduction, Phase 1 conservative value)
+  implBoostMultiplier: 1.3 # Multiplicative boost for implementation files (1.3 = 30% boost)
 
 bugfix:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.8
   dependency: 0.7 # Higher: bugs often in dependencies
   proximity: 0.35
-  structural: 0.
+  structural: 0.7 # Reduced: prevent canvas-agent matching when searching page-agent
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 testfail:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.6
   dependency: 0.85 # Very high: failed tests reveal dependencies
   proximity: 0.3
-  structural: 0.
+  structural: 0.7 # Reduced: focus on actual test dependencies
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 typeerror:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.4
   dependency: 0.6
   proximity: 0.4 # Higher: type errors cluster in modules
-  structural: 0.6 #
+  structural: 0.6 # Already balanced for type analysis
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 feature:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.5
   dependency: 0.45 # Lower: new features less dependent
   proximity: 0.5 # Higher: features cluster spatially
-  structural: 0.
+  structural: 0.6 # Reduced: focus on actual feature files
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
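Note on the scoring profiles: each profile assigns a weight to six ranking signals. How KIRI combines them is not visible in this part of the diff, so the sketch below assumes a plain weighted sum over signals normalised to 0..1. The weights are copied from the `default` and `testfail` profiles above; `SIGNALS`, `Weights`, and `weightedScore` are hypothetical names, and the two multiplier fields are left out to keep the example short.

```typescript
// Hypothetical sketch: signal names mirror scoring-profiles.yml, but the
// weighted-sum formula is an assumption, not KIRI's confirmed implementation.
const SIGNALS = [
  "textMatch",
  "pathMatch",
  "editingPath",
  "dependency",
  "proximity",
  "structural",
] as const;
type Signal = (typeof SIGNALS)[number];
type Weights = Record<Signal, number>;

// Weights copied from the 0.7.0 side of the diff.
const defaultWeights: Weights = {
  textMatch: 1.0, pathMatch: 1.5, editingPath: 2.0,
  dependency: 0.6, proximity: 0.25, structural: 0.6,
};
const testfailWeights: Weights = {
  textMatch: 1.0, pathMatch: 1.5, editingPath: 1.6,
  dependency: 0.85, proximity: 0.3, structural: 0.7,
};

// Sums signal strength × profile weight over all six signals.
function weightedScore(signals: Weights, weights: Weights): number {
  return SIGNALS.reduce((sum, name) => sum + signals[name] * weights[name], 0);
}

// A file that matches the query text and is a dependency of the edited file.
const dependencyHit: Weights = {
  textMatch: 1, pathMatch: 0, editingPath: 0,
  dependency: 1, proximity: 0, structural: 0,
};
console.log(weightedScore(dependencyHit, defaultWeights));
console.log(weightedScore(dependencyHit, testfailWeights));
```

Under this assumed formula the same file scores higher with the `testfail` weights than with `default`, because `dependency` is weighted 0.85 instead of 0.6.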
package/dist/config/default.example.yml
CHANGED
@@ -4,6 +4,15 @@ mcp:
   tools:
     - context_bundle
     - files_search
+
+# Tokenization configuration for keyword extraction
+tokenization:
+  # Strategy: "phrase-aware" (default), "legacy", or "hybrid"
+  # - phrase-aware: Preserves hyphenated terms (e.g., "page-agent" stays as one unit)
+  # - legacy: Splits on hyphens (e.g., "page-agent" → ["page", "agent"])
+  # - hybrid: Emits both phrases and split keywords
+  strategy: "phrase-aware"
+
 indexer:
   repoRoot: "../../target-repo"
   database: "var/index.duckdb"
package/dist/config/scoring-profiles.yml
CHANGED
@@ -2,36 +2,51 @@
 # Each profile defines weights for different ranking signals
 
 default:
-  textMatch: 0
+  textMatch: 1.0 # Text/keyword match weight (increased to prioritize literal matches)
+  pathMatch: 1.5 # Path-based match weight (new - prioritizes files with keywords in paths)
   editingPath: 2.0 # Currently editing file weight
   dependency: 0.6 # Dependency relationship weight (increased to prioritize connected implementation files)
   proximity: 0.25 # Same directory weight
-  structural:
+  structural: 0.6 # Structural similarity weight (reduced to prevent false positives from similar structure)
+  docPenaltyMultiplier: 0.3 # Multiplicative penalty for docs (0.3 = 70% reduction, Phase 1 conservative value)
+  implBoostMultiplier: 1.3 # Multiplicative boost for implementation files (1.3 = 30% boost)
 
 bugfix:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.8
   dependency: 0.7 # Higher: bugs often in dependencies
   proximity: 0.35
-  structural: 0.
+  structural: 0.7 # Reduced: prevent canvas-agent matching when searching page-agent
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 testfail:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.6
   dependency: 0.85 # Very high: failed tests reveal dependencies
   proximity: 0.3
-  structural: 0.
+  structural: 0.7 # Reduced: focus on actual test dependencies
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 typeerror:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.4
   dependency: 0.6
   proximity: 0.4 # Higher: type errors cluster in modules
-  structural: 0.6 #
+  structural: 0.6 # Already balanced for type analysis
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
 
 feature:
   textMatch: 1.0
+  pathMatch: 1.5
   editingPath: 1.5
   dependency: 0.45 # Lower: new features less dependent
   proximity: 0.5 # Higher: features cluster spatially
-  structural: 0.
+  structural: 0.6 # Reduced: focus on actual feature files
+  docPenaltyMultiplier: 0.3
+  implBoostMultiplier: 1.3
package/dist/package.json
CHANGED
package/dist/server/context.js
CHANGED