@mcampa/claude-context-core 0.1.3

Files changed (92)
  1. package/LICENSE +21 -0
  2. package/README.md +285 -0
  3. package/dist/.tsbuildinfo +1 -0
  4. package/dist/context.d.ts +276 -0
  5. package/dist/context.d.ts.map +1 -0
  6. package/dist/context.js +1080 -0
  7. package/dist/context.js.map +1 -0
  8. package/dist/embedding/base-embedding.d.ts +51 -0
  9. package/dist/embedding/base-embedding.d.ts.map +1 -0
  10. package/dist/embedding/base-embedding.js +36 -0
  11. package/dist/embedding/base-embedding.js.map +1 -0
  12. package/dist/embedding/gemini-embedding.d.ts +53 -0
  13. package/dist/embedding/gemini-embedding.d.ts.map +1 -0
  14. package/dist/embedding/gemini-embedding.js +152 -0
  15. package/dist/embedding/gemini-embedding.js.map +1 -0
  16. package/dist/embedding/index.d.ts +6 -0
  17. package/dist/embedding/index.d.ts.map +1 -0
  18. package/dist/embedding/index.js +24 -0
  19. package/dist/embedding/index.js.map +1 -0
  20. package/dist/embedding/ollama-embedding.d.ts +55 -0
  21. package/dist/embedding/ollama-embedding.d.ts.map +1 -0
  22. package/dist/embedding/ollama-embedding.js +192 -0
  23. package/dist/embedding/ollama-embedding.js.map +1 -0
  24. package/dist/embedding/openai-embedding.d.ts +36 -0
  25. package/dist/embedding/openai-embedding.d.ts.map +1 -0
  26. package/dist/embedding/openai-embedding.js +159 -0
  27. package/dist/embedding/openai-embedding.js.map +1 -0
  28. package/dist/embedding/voyageai-embedding.d.ts +44 -0
  29. package/dist/embedding/voyageai-embedding.d.ts.map +1 -0
  30. package/dist/embedding/voyageai-embedding.js +227 -0
  31. package/dist/embedding/voyageai-embedding.js.map +1 -0
  32. package/dist/index.d.ts +8 -0
  33. package/dist/index.d.ts.map +1 -0
  34. package/dist/index.js +24 -0
  35. package/dist/index.js.map +1 -0
  36. package/dist/splitter/ast-splitter.d.ts +22 -0
  37. package/dist/splitter/ast-splitter.d.ts.map +1 -0
  38. package/dist/splitter/ast-splitter.js +234 -0
  39. package/dist/splitter/ast-splitter.js.map +1 -0
  40. package/dist/splitter/index.d.ts +41 -0
  41. package/dist/splitter/index.d.ts.map +1 -0
  42. package/dist/splitter/index.js +27 -0
  43. package/dist/splitter/index.js.map +1 -0
  44. package/dist/splitter/langchain-splitter.d.ts +13 -0
  45. package/dist/splitter/langchain-splitter.d.ts.map +1 -0
  46. package/dist/splitter/langchain-splitter.js +118 -0
  47. package/dist/splitter/langchain-splitter.js.map +1 -0
  48. package/dist/sync/merkle.d.ts +26 -0
  49. package/dist/sync/merkle.d.ts.map +1 -0
  50. package/dist/sync/merkle.js +112 -0
  51. package/dist/sync/merkle.js.map +1 -0
  52. package/dist/sync/synchronizer.d.ts +30 -0
  53. package/dist/sync/synchronizer.d.ts.map +1 -0
  54. package/dist/sync/synchronizer.js +339 -0
  55. package/dist/sync/synchronizer.js.map +1 -0
  56. package/dist/types.d.ts +14 -0
  57. package/dist/types.d.ts.map +1 -0
  58. package/dist/types.js +3 -0
  59. package/dist/types.js.map +1 -0
  60. package/dist/utils/env-manager.d.ts +19 -0
  61. package/dist/utils/env-manager.d.ts.map +1 -0
  62. package/dist/utils/env-manager.js +125 -0
  63. package/dist/utils/env-manager.js.map +1 -0
  64. package/dist/utils/git.d.ts +11 -0
  65. package/dist/utils/git.d.ts.map +1 -0
  66. package/dist/utils/git.js +46 -0
  67. package/dist/utils/git.js.map +1 -0
  68. package/dist/utils/index.d.ts +2 -0
  69. package/dist/utils/index.d.ts.map +1 -0
  70. package/dist/utils/index.js +7 -0
  71. package/dist/utils/index.js.map +1 -0
  72. package/dist/vectordb/index.d.ts +5 -0
  73. package/dist/vectordb/index.d.ts.map +1 -0
  74. package/dist/vectordb/index.js +14 -0
  75. package/dist/vectordb/index.js.map +1 -0
  76. package/dist/vectordb/milvus-restful-vectordb.d.ts +75 -0
  77. package/dist/vectordb/milvus-restful-vectordb.d.ts.map +1 -0
  78. package/dist/vectordb/milvus-restful-vectordb.js +703 -0
  79. package/dist/vectordb/milvus-restful-vectordb.js.map +1 -0
  80. package/dist/vectordb/milvus-vectordb.d.ts +60 -0
  81. package/dist/vectordb/milvus-vectordb.d.ts.map +1 -0
  82. package/dist/vectordb/milvus-vectordb.js +638 -0
  83. package/dist/vectordb/milvus-vectordb.js.map +1 -0
  84. package/dist/vectordb/types.d.ts +120 -0
  85. package/dist/vectordb/types.d.ts.map +1 -0
  86. package/dist/vectordb/types.js +9 -0
  87. package/dist/vectordb/types.js.map +1 -0
  88. package/dist/vectordb/zilliz-utils.d.ts +135 -0
  89. package/dist/vectordb/zilliz-utils.d.ts.map +1 -0
  90. package/dist/vectordb/zilliz-utils.js +192 -0
  91. package/dist/vectordb/zilliz-utils.js.map +1 -0
  92. package/package.json +58 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Zilliz
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,285 @@
1
+ # @zilliz/claude-context-core
2
+ ![](../../assets/claude-context.png)
3
+
4
+ The core indexing engine for Claude Context - a powerful tool for semantic search and analysis of codebases using vector embeddings and AI.
5
+
6
+ [![npm version](https://img.shields.io/npm/v/@zilliz/claude-context-core.svg)](https://www.npmjs.com/package/@zilliz/claude-context-core)
7
+ [![npm downloads](https://img.shields.io/npm/dm/@zilliz/claude-context-core.svg)](https://www.npmjs.com/package/@zilliz/claude-context-core)
8
+
9
+ > 📖 **New to Claude Context?** Check out the [main project README](../../README.md) for an overview and quick start guide.
10
+
11
+ ## Installation
12
+
13
+ ```bash
14
+ npm install @zilliz/claude-context-core
15
+ ```
16
+
17
+ ### Prepare Environment Variables
18
+ #### OpenAI API key
19
+ See the [OpenAI documentation](https://platform.openai.com/docs/api-reference) for details on obtaining an API key.
20
+ ```bash
21
+ OPENAI_API_KEY=your-openai-api-key
22
+ ```
23
+
24
+ #### Zilliz Cloud configuration
25
+ Get a free Milvus vector database on Zilliz Cloud.
26
+
27
+ Claude Context needs a vector database. You can [sign up](https://cloud.zilliz.com/signup?utm_source=github&utm_medium=referral&utm_campaign=2507-codecontext-readme) on Zilliz Cloud to get a free Serverless cluster.
28
+
29
+ ![](../../assets/signup_and_create_cluster.jpeg)
30
+
31
+ After creating your cluster, open your Zilliz Cloud console and copy both the **public endpoint** and your **API key**.
32
+ These will be used as `your-zilliz-cloud-public-endpoint` and `your-zilliz-cloud-api-key` in the configuration examples.
33
+
34
+ ![Zilliz Cloud Dashboard](../../assets/zilliz_cloud_dashboard.jpeg)
35
+
36
+ Keep both values handy for the configuration steps below.
37
+
38
+ If you need help creating your free vector database or finding these values, see the [Zilliz Cloud documentation](https://docs.zilliz.com/docs/create-cluster) for detailed instructions.
39
+
40
+ ```bash
41
+ MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint
42
+ MILVUS_TOKEN=your-zilliz-cloud-api-key
43
+ ```
44
+
45
+ > 💡 **Tip**: For easier configuration management across different usage scenarios, consider using [global environment variables](../../docs/getting-started/environment-variables.md).
46
+
47
+ ## Quick Start
48
+
49
+ ```typescript
50
+ import {
51
+     Context,
52
+     OpenAIEmbedding,
53
+     MilvusVectorDatabase
54
+ } from '@zilliz/claude-context-core';
55
+
56
+ // Initialize embedding provider
57
+ const embedding = new OpenAIEmbedding({
58
+     apiKey: process.env.OPENAI_API_KEY || 'your-openai-api-key',
59
+     model: 'text-embedding-3-small'
60
+ });
61
+
62
+ // Initialize vector database
63
+ const vectorDatabase = new MilvusVectorDatabase({
64
+     address: process.env.MILVUS_ADDRESS || 'localhost:19530',
65
+     token: process.env.MILVUS_TOKEN || ''
66
+ });
67
+
68
+ // Create context instance
69
+ const context = new Context({
70
+     embedding,
71
+     vectorDatabase
72
+ });
73
+
74
+ // Index a codebase
75
+ const stats = await context.indexCodebase('./my-project', (progress) => {
76
+     console.log(`${progress.phase} - ${progress.percentage}%`);
77
+ });
78
+
79
+ console.log(`Indexed ${stats.indexedFiles} files with ${stats.totalChunks} chunks`);
80
+
81
+ // Search the codebase
82
+ const results = await context.semanticSearch(
83
+     './my-project',
84
+     'function that handles user authentication',
85
+     5
86
+ );
87
+
88
+ results.forEach(result => {
89
+     console.log(`${result.relativePath}:${result.startLine}-${result.endLine}`);
90
+     console.log(`Score: ${result.score}`);
91
+     console.log(result.content);
92
+ });
93
+ ```
94
+
95
+ ## Features
96
+
97
+ - **Multi-language Support**: Index TypeScript, JavaScript, Python, Java, C++, and many other programming languages
98
+ - **Semantic Search**: Find code using natural language queries powered by AI embeddings
99
+ - **Flexible Architecture**: Pluggable embedding providers and vector databases
100
+ - **Smart Chunking**: Intelligent code splitting that preserves context and structure
101
+ - **Batch Processing**: Efficient processing of large codebases with progress tracking
102
+ - **Pattern Matching**: Built-in ignore patterns for common build artifacts and dependencies
103
+ - **Incremental File Synchronization**: Efficient change detection using Merkle trees to only re-index modified files
104
+
105
+ ## Embedding Providers
106
+
107
+ - **OpenAI Embeddings** (`text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`)
108
+ - **VoyageAI Embeddings** - High-quality embeddings optimized for code (`voyage-code-3`, `voyage-3.5`, etc.)
109
+ - **Gemini Embeddings** - Google's embedding models (`gemini-embedding-001`)
110
+ - **Ollama Embeddings** - Local embedding models via Ollama
111
+
112
+ ## Vector Database Support
113
+
114
+ - **Milvus/Zilliz Cloud** - High-performance vector database
115
+
116
+ ## Code Splitters
117
+
118
+ - **AST Code Splitter** - AST-based code splitting with automatic fallback (default)
119
+ - **LangChain Code Splitter** - Character-based code chunking
120
+
121
+ ## Configuration
122
+
123
+ ### ContextConfig
124
+
125
+ ```typescript
126
+ interface ContextConfig {
127
+     embedding?: Embedding;            // Embedding provider
128
+     vectorDatabase?: VectorDatabase;  // Vector database instance (required)
129
+     codeSplitter?: Splitter;          // Code splitting strategy
130
+     supportedExtensions?: string[];   // File extensions to index
131
+     ignorePatterns?: string[];        // Patterns to ignore
132
+     customExtensions?: string[];      // Custom extensions from MCP
133
+     customIgnorePatterns?: string[];  // Custom ignore patterns from MCP
134
+ }
135
+ ```
136
+
137
+ ### Supported File Extensions (Default)
138
+
139
+ ```typescript
140
+ [
141
+     // Programming languages
141
+     '.ts', '.tsx', '.js', '.jsx', '.py', '.java', '.cpp', '.c', '.h', '.hpp',
142
+     '.cs', '.go', '.rs', '.php', '.rb', '.swift', '.kt', '.scala', '.m', '.mm',
143
+     // Text and markup files
144
+     '.md', '.markdown', '.ipynb'
146
+ ]
147
+ ```
148
+
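The default list above implies a per-file extension check before indexing. A minimal sketch of such a check, assuming case-insensitive matching on the final extension; the helper name `isIndexable` and the matching rule are illustrative assumptions, not taken from the package:

```typescript
import * as path from "node:path";

// Abbreviated copy of the default extension list shown above.
const supportedExtensions: string[] = ['.ts', '.tsx', '.js', '.jsx', '.py', '.md', '.ipynb'];

// Hypothetical helper: true when the file's final extension is in the list.
// Lower-casing the extension is an assumption made for this sketch.
function isIndexable(filePath: string, extensions: string[] = supportedExtensions): boolean {
    const ext = path.extname(filePath).toLowerCase();
    return ext !== '' && extensions.includes(ext);
}
```

Files without an extension (for example `Makefile`) are skipped by this sketch because `path.extname` returns an empty string for them.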
149
+ ### Default Ignore Patterns
150
+
151
+ - Build and dependency directories: `node_modules/**`, `dist/**`, `build/**`, `out/**`, `target/**`
152
+ - Version control: `.git/**`, `.svn/**`, `.hg/**`
153
+ - IDE files: `.vscode/**`, `.idea/**`, `*.swp`, `*.swo`
154
+ - Cache directories: `.cache/**`, `__pycache__/**`, `.pytest_cache/**`, `coverage/**`
155
+ - Minified files: `*.min.js`, `*.min.css`, `*.bundle.js`, `*.map`
156
+ - Log and temp files: `logs/**`, `tmp/**`, `temp/**`, `*.log`
157
+ - Environment files: `.env`, `.env.*`, `*.local`
158
+
159
+ ## API Reference
160
+
161
+ ### Context
162
+
163
+ #### Methods
164
+
165
+ - `indexCodebase(path, progressCallback?, forceReindex?)` - Index an entire codebase
166
+ - `reindexByChange(path, progressCallback?)` - Incrementally re-index only changed files
167
+ - `semanticSearch(path, query, topK?, threshold?, filterExpr?)` - Search indexed code semantically
168
+ - `hasIndex(path)` - Check if codebase is already indexed
169
+ - `clearIndex(path, progressCallback?)` - Remove index for a codebase
170
+ - `updateIgnorePatterns(patterns)` - Update ignore patterns
171
+ - `addCustomIgnorePatterns(patterns)` - Add custom ignore patterns
172
+ - `addCustomExtensions(extensions)` - Add custom file extensions
173
+ - `updateEmbedding(embedding)` - Switch embedding provider
174
+ - `updateVectorDatabase(vectorDB)` - Switch vector database
175
+ - `updateSplitter(splitter)` - Switch code splitter
176
+
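The methods above suggest a common pattern: index fully on the first run and sync incrementally afterwards. The sketch below is self-contained and types only the three methods it needs; the return types (`Promise<boolean>` for `hasIndex`, `Promise<unknown>` for the others) and the helper name `ensureIndexed` are assumptions, since the README does not state them.

```typescript
// Structural subset of the Context methods listed above.
// Return types are assumed for illustration only.
interface IndexingContext {
    hasIndex(path: string): Promise<boolean>;
    indexCodebase(path: string): Promise<unknown>;
    reindexByChange(path: string): Promise<unknown>;
}

// Hypothetical helper: full index when none exists, incremental re-index otherwise.
async function ensureIndexed(
    context: IndexingContext,
    codebasePath: string
): Promise<'indexed' | 'reindexed'> {
    if (await context.hasIndex(codebasePath)) {
        await context.reindexByChange(codebasePath);
        return 'reindexed';
    }
    await context.indexCodebase(codebasePath);
    return 'indexed';
}
```

Under these assumptions, a `Context` instance from the Quick Start could be passed directly as `context`.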
177
+ ### Search Results
178
+
179
+ ```typescript
180
+ interface SemanticSearchResult {
181
+     content: string;      // Code content
182
+     relativePath: string; // File path relative to codebase root
183
+     startLine: number;    // Starting line number
184
+     endLine: number;      // Ending line number
185
+     language: string;     // Programming language
186
+     score: number;        // Similarity score (0-1)
187
+ }
188
+ ```
189
+
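As an illustration of consuming this shape, the sketch below keeps results at or above a minimum score and formats them best-first. The interface is copied from the block above so the example stands alone; the helper name `formatTopResults` and the output format are hypothetical.

```typescript
// Local copy of the result shape documented above.
interface SemanticSearchResult {
    content: string;
    relativePath: string;
    startLine: number;
    endLine: number;
    language: string;
    score: number;
}

// Hypothetical helper: drop low-scoring hits, sort by descending score,
// and render each hit as "path:start-end (score)".
function formatTopResults(results: SemanticSearchResult[], minScore: number): string[] {
    return results
        .filter((r) => r.score >= minScore) // filter() copies, so the input is not mutated by sort()
        .sort((a, b) => b.score - a.score)
        .map((r) => `${r.relativePath}:${r.startLine}-${r.endLine} (${r.score.toFixed(2)})`);
}
```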
190
+
191
+ ## Examples
192
+
193
+ ### Using VoyageAI Embeddings
194
+
195
+ ```typescript
196
+ import { Context, MilvusVectorDatabase, VoyageAIEmbedding } from '@zilliz/claude-context-core';
197
+
198
+ // Initialize with VoyageAI embedding provider
199
+ const embedding = new VoyageAIEmbedding({
200
+     apiKey: process.env.VOYAGEAI_API_KEY || 'your-voyageai-api-key',
200
+     model: 'voyage-code-3'
202
+ });
203
+
204
+ const vectorDatabase = new MilvusVectorDatabase({
205
+     address: process.env.MILVUS_ADDRESS || 'localhost:19530',
205
+     token: process.env.MILVUS_TOKEN || ''
207
+ });
208
+
209
+ const context = new Context({
210
+     embedding,
210
+     vectorDatabase
212
+ });
213
+ ```
214
+
215
+ ### Custom File Filtering
216
+
217
+ ```typescript
218
+ const context = new Context({
219
+     embedding,
219
+     vectorDatabase,
220
+     supportedExtensions: ['.ts', '.js', '.py', '.java'],
221
+     ignorePatterns: [
222
+         'node_modules/**',
223
+         'dist/**',
224
+         '*.spec.ts',
225
+         '*.test.js'
226
+     ]
228
+ });
229
+ ```
230
+
231
+ ## File Synchronization Architecture
232
+
233
+ Claude Context implements a file synchronization system that tracks which files have changed since the last indexing operation and processes only those. Unchanged files are not re-embedded, which makes re-indexing large codebases much faster.
234
+
235
+ ![File Synchronization Architecture](../../assets/file_synchronizer.png)
236
+
237
+ ### How It Works
238
+
239
+ The file synchronization system uses a **Merkle tree-based approach** combined with SHA-256 file hashing to detect changes:
240
+
241
+ #### 1. File Hashing
242
+ - Each file in the codebase is hashed using SHA-256
243
+ - File hashes are computed based on file content, not metadata
244
+ - Hashes are stored with relative file paths for consistency across different environments
245
+
246
+ #### 2. Merkle Tree Construction
247
+ - All file hashes are organized into a Merkle tree structure
248
+ - The tree provides a single root hash that represents the entire codebase state
249
+ - Any change to any file will cause the root hash to change
250
+
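Steps 1 and 2 can be sketched as follows. This is a generic binary Merkle construction written for illustration, not the package's implementation: the leaf encoding (`path:hash`), sorting leaves by path, and hashing an unpaired node on its own are all assumptions.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of a string or buffer, hex-encoded.
function sha256(data: string | Buffer): string {
    return createHash('sha256').update(data).digest('hex');
}

// Builds a single root hash from a map of relative path -> content hash.
function merkleRoot(fileHashes: Map<string, string>): string {
    // Sort by path so the root does not depend on insertion order.
    let level: string[] = Array.from(fileHashes.entries())
        .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        .map(([relPath, hash]) => sha256(`${relPath}:${hash}`));
    if (level.length === 0) return sha256('');
    while (level.length > 1) {
        const next: string[] = [];
        for (let i = 0; i < level.length; i += 2) {
            // Pair adjacent nodes; an unpaired last node is hashed on its own.
            const right = i + 1 < level.length ? level[i + 1] : '';
            next.push(sha256(level[i] + right));
        }
        level = next;
    }
    return level[0];
}
```

In this sketch a content hash would come from `sha256(fileContents)`, so a change to any file alters its leaf and therefore the root.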
251
+ #### 3. Snapshot Management
252
+ - File synchronization state is persisted to the `~/.context/merkle/` directory
253
+ - Each codebase gets a unique snapshot file based on its absolute path hash
254
+ - Snapshots contain both file hashes and serialized Merkle tree data
255
+
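One way to derive a per-codebase snapshot location is sketched below. The README states only that the file name is based on a hash of the absolute path; the choice of SHA-256, the `.json` extension, and the helper name `snapshotPathFor` are assumptions made for this example.

```typescript
import { createHash } from "node:crypto";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: maps a codebase directory to a snapshot file under
// <home>/.context/merkle/. Hash algorithm and extension are assumed.
function snapshotPathFor(codebasePath: string, homeDir: string = os.homedir()): string {
    // Resolve first so equivalent spellings of a path share one snapshot.
    const absolute = path.resolve(codebasePath);
    const id = createHash('sha256').update(absolute).digest('hex');
    return path.join(homeDir, '.context', 'merkle', `${id}.json`);
}
```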
256
+ #### 4. Change Detection Process
257
+ 1. **Quick Check**: Compare current Merkle root hash with stored snapshot
258
+ 2. **Detailed Analysis**: If root hashes differ, perform file-by-file comparison
259
+ 3. **Change Classification**: Categorize changes into three types:
260
+    - **Added**: New files that didn't exist before
260
+    - **Modified**: Existing files with changed content
261
+    - **Removed**: Files that were deleted from the codebase
263
+
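The detailed analysis and classification steps can be sketched as a comparison of two path-to-hash maps. The names `FileChanges` and `diffSnapshots` are hypothetical; in practice the root-hash quick check would run first, and this comparison only when the roots differ.

```typescript
interface FileChanges {
    added: string[];
    modified: string[];
    removed: string[];
}

// Compares the stored snapshot with the current state, both keyed by relative path.
function diffSnapshots(previous: Map<string, string>, current: Map<string, string>): FileChanges {
    const changes: FileChanges = { added: [], modified: [], removed: [] };
    current.forEach((hash, relPath) => {
        const oldHash = previous.get(relPath);
        if (oldHash === undefined) {
            changes.added.push(relPath);    // present now, absent before
        } else if (oldHash !== hash) {
            changes.modified.push(relPath); // present in both, content differs
        }
    });
    previous.forEach((_hash, relPath) => {
        if (!current.has(relPath)) {
            changes.removed.push(relPath);  // present before, absent now
        }
    });
    return changes;
}
```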
264
+ #### 5. Incremental Updates
265
+ - Only process files that have actually changed
266
+ - Update vector database entries only for modified chunks
267
+ - Remove entries for deleted files
268
+ - Add entries for new files
269
+
270
+
271
+ ## Contributing
272
+
273
+ This package is part of the Claude Context monorepo. Please see:
274
+ - [Main Contributing Guide](../../CONTRIBUTING.md) - General contribution guidelines
275
+ - [Core Package Contributing](CONTRIBUTING.md) - Specific development guide for this package
276
+
277
+ ## Related Packages
278
+
279
+ - **[@claude-context/mcp](../mcp)** - MCP server that uses this core engine
280
+ - **[VSCode Extension](../vscode-extension)** - VSCode extension built on this core
281
+
282
+
283
+ ## License
284
+
285
+ MIT - See [LICENSE](../../LICENSE) for details