@mcampa/ai-context-core 0.0.1-beta.05e8984

Files changed (88)
  1. package/LICENSE +21 -0
  2. package/README.md +354 -0
  3. package/dist/.tsbuildinfo +1 -0
  4. package/dist/context.d.ts +276 -0
  5. package/dist/context.d.ts.map +1 -0
  6. package/dist/context.js +1177 -0
  7. package/dist/context.js.map +1 -0
  8. package/dist/embedding/base-embedding.d.ts +51 -0
  9. package/dist/embedding/base-embedding.d.ts.map +1 -0
  10. package/dist/embedding/base-embedding.js +36 -0
  11. package/dist/embedding/base-embedding.js.map +1 -0
  12. package/dist/embedding/gemini-embedding.d.ts +53 -0
  13. package/dist/embedding/gemini-embedding.d.ts.map +1 -0
  14. package/dist/embedding/gemini-embedding.js +154 -0
  15. package/dist/embedding/gemini-embedding.js.map +1 -0
  16. package/dist/embedding/index.d.ts +6 -0
  17. package/dist/embedding/index.d.ts.map +1 -0
  18. package/dist/embedding/index.js +24 -0
  19. package/dist/embedding/index.js.map +1 -0
  20. package/dist/embedding/ollama-embedding.d.ts +55 -0
  21. package/dist/embedding/ollama-embedding.d.ts.map +1 -0
  22. package/dist/embedding/ollama-embedding.js +193 -0
  23. package/dist/embedding/ollama-embedding.js.map +1 -0
  24. package/dist/embedding/openai-embedding.d.ts +36 -0
  25. package/dist/embedding/openai-embedding.d.ts.map +1 -0
  26. package/dist/embedding/openai-embedding.js +161 -0
  27. package/dist/embedding/openai-embedding.js.map +1 -0
  28. package/dist/embedding/voyageai-embedding.d.ts +44 -0
  29. package/dist/embedding/voyageai-embedding.d.ts.map +1 -0
  30. package/dist/embedding/voyageai-embedding.js +227 -0
  31. package/dist/embedding/voyageai-embedding.js.map +1 -0
  32. package/dist/index.d.ts +8 -0
  33. package/dist/index.d.ts.map +1 -0
  34. package/dist/index.js +24 -0
  35. package/dist/index.js.map +1 -0
  36. package/dist/splitter/ast-splitter.d.ts +22 -0
  37. package/dist/splitter/ast-splitter.d.ts.map +1 -0
  38. package/dist/splitter/ast-splitter.js +308 -0
  39. package/dist/splitter/ast-splitter.js.map +1 -0
  40. package/dist/splitter/index.d.ts +41 -0
  41. package/dist/splitter/index.d.ts.map +1 -0
  42. package/dist/splitter/index.js +27 -0
  43. package/dist/splitter/index.js.map +1 -0
  44. package/dist/splitter/langchain-splitter.d.ts +13 -0
  45. package/dist/splitter/langchain-splitter.d.ts.map +1 -0
  46. package/dist/splitter/langchain-splitter.js +118 -0
  47. package/dist/splitter/langchain-splitter.js.map +1 -0
  48. package/dist/sync/merkle.d.ts +30 -0
  49. package/dist/sync/merkle.d.ts.map +1 -0
  50. package/dist/sync/merkle.js +112 -0
  51. package/dist/sync/merkle.js.map +1 -0
  52. package/dist/sync/synchronizer.d.ts +30 -0
  53. package/dist/sync/synchronizer.d.ts.map +1 -0
  54. package/dist/sync/synchronizer.js +347 -0
  55. package/dist/sync/synchronizer.js.map +1 -0
  56. package/dist/types.d.ts +14 -0
  57. package/dist/types.d.ts.map +1 -0
  58. package/dist/types.js +3 -0
  59. package/dist/types.js.map +1 -0
  60. package/dist/utils/env-manager.d.ts +19 -0
  61. package/dist/utils/env-manager.d.ts.map +1 -0
  62. package/dist/utils/env-manager.js +125 -0
  63. package/dist/utils/env-manager.js.map +1 -0
  64. package/dist/utils/index.d.ts +2 -0
  65. package/dist/utils/index.d.ts.map +1 -0
  66. package/dist/utils/index.js +7 -0
  67. package/dist/utils/index.js.map +1 -0
  68. package/dist/vectordb/index.d.ts +5 -0
  69. package/dist/vectordb/index.d.ts.map +1 -0
  70. package/dist/vectordb/index.js +14 -0
  71. package/dist/vectordb/index.js.map +1 -0
  72. package/dist/vectordb/milvus-restful-vectordb.d.ts +75 -0
  73. package/dist/vectordb/milvus-restful-vectordb.d.ts.map +1 -0
  74. package/dist/vectordb/milvus-restful-vectordb.js +728 -0
  75. package/dist/vectordb/milvus-restful-vectordb.js.map +1 -0
  76. package/dist/vectordb/milvus-vectordb.d.ts +60 -0
  77. package/dist/vectordb/milvus-vectordb.d.ts.map +1 -0
  78. package/dist/vectordb/milvus-vectordb.js +662 -0
  79. package/dist/vectordb/milvus-vectordb.js.map +1 -0
  80. package/dist/vectordb/types.d.ts +120 -0
  81. package/dist/vectordb/types.d.ts.map +1 -0
  82. package/dist/vectordb/types.js +9 -0
  83. package/dist/vectordb/types.js.map +1 -0
  84. package/dist/vectordb/zilliz-utils.d.ts +135 -0
  85. package/dist/vectordb/zilliz-utils.d.ts.map +1 -0
  86. package/dist/vectordb/zilliz-utils.js +197 -0
  87. package/dist/vectordb/zilliz-utils.js.map +1 -0
  88. package/package.json +58 -0
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Zilliz
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,354 @@
+ # @mcampa/claude-context-core
+
+ ![](../../assets/claude-context.png)
+
+ The core indexing engine for Claude Context - a powerful tool for semantic search and analysis of codebases using vector embeddings and AI.
+
+ [![npm version](https://img.shields.io/npm/v/@mcampa/claude-context-core.svg)](https://www.npmjs.com/package/@mcampa/claude-context-core)
+ [![npm downloads](https://img.shields.io/npm/dm/@mcampa/claude-context-core.svg)](https://www.npmjs.com/package/@mcampa/claude-context-core)
+
+ > 📖 **New to Claude Context?** Check out the [main project README](../../README.md) for an overview and quick start guide.
+
+ ## Installation
+
+ ```bash
+ npm install @mcampa/claude-context-core
+ ```
+
+ ### Prepare Environment Variables
+
+ #### OpenAI API key
+
+ See the [OpenAI documentation](https://platform.openai.com/docs/api-reference) for details on obtaining an API key.
+
+ ```bash
+ OPENAI_API_KEY=your-openai-api-key
+ ```
+
+ #### Zilliz Cloud configuration
+
+ Get a free Milvus vector database on Zilliz Cloud.
+
+ Claude Context needs a vector database. You can [sign up](https://cloud.zilliz.com/signup?utm_source=github&utm_medium=referral&utm_campaign=2507-codecontext-readme) on Zilliz Cloud to get a free Serverless cluster.
+
+ ![](../../assets/signup_and_create_cluster.jpeg)
+
+ After creating your cluster, open your Zilliz Cloud console and copy both the **public endpoint** and your **API key**.
+ These will be used as `your-zilliz-cloud-public-endpoint` and `your-zilliz-cloud-api-key` in the configuration examples.
+
+ ![Zilliz Cloud Dashboard](../../assets/zilliz_cloud_dashboard.jpeg)
+
+ Keep both values handy for the configuration steps below.
+
+ If you need help creating your free vector database or finding these values, see the [Zilliz Cloud documentation](https://docs.zilliz.com/docs/create-cluster) for detailed instructions.
+
+ ```bash
+ MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint
+ MILVUS_TOKEN=your-zilliz-cloud-api-key
+ ```
+
+ > 💡 **Tip**: For easier configuration management across different usage scenarios, consider using [global environment variables](../../docs/getting-started/environment-variables.md).
+
+ ## Quick Start
+
+ ```typescript
+ import {
+   Context,
+   OpenAIEmbedding,
+   MilvusVectorDatabase,
+ } from "@mcampa/claude-context-core";
+
+ // Initialize embedding provider
+ const embedding = new OpenAIEmbedding({
+   apiKey: process.env.OPENAI_API_KEY || "your-openai-api-key",
+   model: "text-embedding-3-small",
+ });
+
+ // Initialize vector database
+ const vectorDatabase = new MilvusVectorDatabase({
+   address: process.env.MILVUS_ADDRESS || "localhost:19530",
+   token: process.env.MILVUS_TOKEN || "",
+ });
+
+ // Create context instance
+ const context = new Context({
+   name: "my-context",
+   embedding,
+   vectorDatabase,
+ });
+
+ // Index a codebase
+ const stats = await context.indexCodebase("./my-project", (progress) => {
+   console.log(`${progress.phase} - ${progress.percentage}%`);
+ });
+
+ console.log(
+   `Indexed ${stats.indexedFiles} files with ${stats.totalChunks} chunks`,
+ );
+
+ // Search the codebase
+ const results = await context.semanticSearch(
+   "function that handles user authentication",
+   5,
+ );
+
+ results.forEach((result) => {
+   console.log(`${result.relativePath}:${result.startLine}-${result.endLine}`);
+   console.log(`Score: ${result.score}`);
+   console.log(result.content);
+ });
+ ```
+
+ ## Features
+
+ - **Multi-language Support**: Index TypeScript, JavaScript, Python, Java, C++, and many other programming languages
+ - **Semantic Search**: Find code using natural language queries powered by AI embeddings
+ - **Flexible Architecture**: Pluggable embedding providers and vector databases
+ - **Smart Chunking**: Intelligent code splitting that preserves context and structure
+ - **Batch Processing**: Efficient processing of large codebases with progress tracking
+ - **Pattern Matching**: Built-in ignore patterns for common build artifacts and dependencies
+ - **Incremental File Synchronization**: Efficient change detection using Merkle trees to only re-index modified files
+
+ ## Embedding Providers
+
+ - **OpenAI Embeddings** (`text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`)
+ - **VoyageAI Embeddings** - High-quality embeddings optimized for code (`voyage-code-3`, `voyage-3.5`, etc.)
+ - **Gemini Embeddings** - Google's embedding models (`gemini-embedding-001`)
+ - **Ollama Embeddings** - Local embedding models via Ollama
+
+ ## Vector Database Support
+
+ - **Milvus/Zilliz Cloud** - High-performance vector database
+
+ ## Code Splitters
+
+ - **AST Code Splitter** - AST-based code splitting with automatic fallback (default)
+ - **LangChain Code Splitter** - Character-based code chunking
+
+ ## Configuration
+
+ ### ContextConfig
+
+ ```typescript
+ interface ContextConfig {
+   name?: string; // Context name (default: 'my-context')
+   embedding?: Embedding; // Embedding provider
+   vectorDatabase?: VectorDatabase; // Vector database instance (required)
+   codeSplitter?: Splitter; // Code splitting strategy
+   supportedExtensions?: string[]; // File extensions to index (replaces defaults if provided)
+   ignorePatterns?: string[]; // Patterns to ignore (replaces defaults if provided)
+   customExtensions?: string[]; // Additional extensions to add to supportedExtensions
+   customIgnorePatterns?: string[]; // Additional patterns to add to ignorePatterns
+ }
+ ```
+
+ **Note on configuration behavior:**
+
+ - If you provide `supportedExtensions`, it **replaces** the default extensions entirely
+ - If you provide `ignorePatterns`, it **replaces** the default ignore patterns entirely
+ - `customExtensions` and `customIgnorePatterns` are **added** to whatever base is used (defaults or your custom ones)
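
The replace-versus-add rules in the note above can be illustrated with a small sketch. This is not the package's implementation: `resolveList` and the shortened `DEFAULT_EXTENSIONS` array are hypothetical names introduced only to show the described behavior, and de-duplication is an added assumption.

```typescript
// Hypothetical helper, not part of the package's API.
function resolveList(
  defaults: string[],
  override?: string[],
  custom: string[] = [],
): string[] {
  // An explicit override replaces the defaults entirely.
  const base = override !== undefined ? override : defaults;
  // Custom entries are appended to whichever base is in use.
  // Dropping duplicates is an assumption made for this sketch.
  return base.concat(custom).filter((value, index, all) => all.indexOf(value) === index);
}

// Shortened default list, for illustration only.
const DEFAULT_EXTENSIONS = [".ts", ".js", ".py"];

// Only custom entries given: defaults are kept and extended.
const added = resolveList(DEFAULT_EXTENSIONS, undefined, [".vue"]);
// value: [".ts", ".js", ".py", ".vue"]

// Override and custom entries given: defaults are dropped.
const replaced = resolveList(DEFAULT_EXTENSIONS, [".go"], [".vue"]);
// value: [".go", ".vue"]

console.log(added, replaced);
```
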
+
+ ### Supported File Extensions (Default)
+
+ ```typescript
+ [
+   // Programming languages
+   ".ts",
+   ".tsx",
+   ".js",
+   ".jsx",
+   ".py",
+   ".java",
+   ".cpp",
+   ".c",
+   ".h",
+   ".hpp",
+   ".cs",
+   ".go",
+   ".rs",
+   ".php",
+   ".rb",
+   ".swift",
+   ".kt",
+   ".scala",
+   ".m",
+   ".mm",
+   // Text and markup files
+   ".md",
+   ".markdown",
+   ".ipynb",
+ ];
+ ```
+
+ ### Default Ignore Patterns
+
+ - Build and dependency directories: `node_modules/**`, `dist/**`, `build/**`, `out/**`, `target/**`
+ - Version control: `.git/**`, `.svn/**`, `.hg/**`
+ - IDE files: `.vscode/**`, `.idea/**`, `*.swp`, `*.swo`
+ - Cache directories: `.cache/**`, `__pycache__/**`, `.pytest_cache/**`, `coverage/**`
+ - Minified files: `*.min.js`, `*.min.css`, `*.bundle.js`, `*.map`
+ - Log and temp files: `logs/**`, `tmp/**`, `temp/**`, `*.log`
+ - Environment files: `.env`, `.env.*`, `*.local`
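
To give a feel for how patterns like these select files, here is a deliberately simplified matcher. The README does not say which glob library or matching rules the package uses, so `globToRegExp` and `isIgnored` are hypothetical, and the semantics (`**` crosses directories, `*` stays within one path segment, slash-free patterns match the file name only) are assumptions.

```typescript
// Simplified glob matching for illustration; real glob libraries cover many more cases.
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters (all except *)
    .replace(/\*\*/g, "\u0000") // protect ** before handling single *
    .replace(/\*/g, "[^/]*") // * matches within one path segment
    .replace(/\u0000/g, ".*"); // ** matches across segments
  return new RegExp("^" + source + "$");
}

function isIgnored(relativePath: string, patterns: string[]): boolean {
  const normalized = relativePath.split("\\").join("/");
  const baseName = normalized.slice(normalized.lastIndexOf("/") + 1);
  return patterns.some((pattern) => {
    const re = globToRegExp(pattern);
    // Assumption: patterns without a slash (e.g. "*.log") apply to the file name only.
    return pattern.indexOf("/") === -1 ? re.test(baseName) : re.test(normalized);
  });
}

const samplePatterns = ["node_modules/**", "*.min.js", ".env.*"];
console.log(isIgnored("node_modules/lodash/index.js", samplePatterns)); // true
console.log(isIgnored("src/app.min.js", samplePatterns)); // true
console.log(isIgnored("src/app.ts", samplePatterns)); // false
```

In this sketch a pattern such as `node_modules/**` only matches at the root of the relative path; a nested `packages/a/node_modules/...` would need a leading `**/`. Whether the package behaves the same way is not stated in the README.
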
+
+ ## API Reference
+
+ ### Context
+
+ #### Methods
+
+ - `indexCodebase(path, progressCallback?, forceReindex?)` - Index an entire codebase
+ - `reindexByChange(path, progressCallback?)` - Incrementally re-index only changed files
+ - `semanticSearch(query, topK?, threshold?, filterExpr?)` - Search indexed code semantically
+ - `hasIndex()` - Check if index exists
+ - `clearIndex(path, progressCallback?)` - Remove index for a codebase
+ - `updateIgnorePatterns(patterns)` - Update ignore patterns
+ - `addCustomIgnorePatterns(patterns)` - Add custom ignore patterns
+ - `addCustomExtensions(extensions)` - Add custom file extensions
+ - `updateEmbedding(embedding)` - Switch embedding provider
+ - `updateVectorDatabase(vectorDB)` - Switch vector database
+ - `updateSplitter(splitter)` - Switch code splitter
+
+ ### Search Results
+
+ ```typescript
+ interface SemanticSearchResult {
+   content: string; // Code content
+   relativePath: string; // File path relative to codebase root
+   startLine: number; // Starting line number
+   endLine: number; // Ending line number
+   language: string; // Programming language
+   score: number; // Similarity score (0-1)
+ }
+ ```
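
As one possible way to post-process results of this shape, the sketch below keeps only the highest-scoring hit per file above a minimum score. The interface is copied from the README so the block stands alone; `bestHitPerFile` is a hypothetical helper, not something the package exports.

```typescript
// Copied from the README so the example is self-contained.
interface SemanticSearchResult {
  content: string;
  relativePath: string;
  startLine: number;
  endLine: number;
  language: string;
  score: number;
}

// Hypothetical helper: best hit per file, highest score first.
function bestHitPerFile(
  results: SemanticSearchResult[],
  minScore = 0,
): SemanticSearchResult[] {
  const best: Record<string, SemanticSearchResult> = {};
  for (const result of results) {
    if (result.score < minScore) continue;
    const current = best[result.relativePath];
    if (!current || result.score > current.score) {
      best[result.relativePath] = result;
    }
  }
  return Object.keys(best)
    .map((path) => best[path])
    .sort((a, b) => b.score - a.score);
}

// Small fabricated sample, used only to exercise the helper.
const hit = (relativePath: string, score: number): SemanticSearchResult => ({
  content: "",
  relativePath,
  startLine: 1,
  endLine: 1,
  language: "typescript",
  score,
});

const top = bestHitPerFile(
  [hit("a.ts", 0.9), hit("a.ts", 0.5), hit("b.ts", 0.7), hit("c.ts", 0.1)],
  0.3,
);
console.log(top.map((r) => `${r.relativePath}:${r.score}`)); // a.ts:0.9, then b.ts:0.7
```
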
+
+ ## Examples
+
+ ### Using VoyageAI Embeddings
+
+ ```typescript
+ import {
+   Context,
+   MilvusVectorDatabase,
+   VoyageAIEmbedding,
+ } from "@mcampa/claude-context-core";
+
+ // Initialize with VoyageAI embedding provider
+ const embedding = new VoyageAIEmbedding({
+   apiKey: process.env.VOYAGEAI_API_KEY || "your-voyageai-api-key",
+   model: "voyage-code-3",
+ });
+
+ const vectorDatabase = new MilvusVectorDatabase({
+   address: process.env.MILVUS_ADDRESS || "localhost:19530",
+   token: process.env.MILVUS_TOKEN || "",
+ });
+
+ const context = new Context({
+   name: "my-context",
+   embedding,
+   vectorDatabase,
+ });
+ ```
+
+ ### Custom File Filtering
+
+ ```typescript
+ // Replace default extensions and ignore patterns entirely
+ const context = new Context({
+   name: "my-context",
+   embedding,
+   vectorDatabase,
+   supportedExtensions: [".ts", ".js", ".py", ".java"], // Only these extensions
+   ignorePatterns: ["node_modules/**", "dist/**", "*.spec.ts", "*.test.js"], // Only these patterns
+ });
+
+ // Or add to defaults using custom* properties
+ const context2 = new Context({
+   name: "my-context",
+   embedding,
+   vectorDatabase,
+   customExtensions: [".vue", ".svelte"], // Adds to default extensions
+   customIgnorePatterns: ["*.spec.ts"], // Adds to default ignore patterns
+ });
+ ```
+
+ ### Relative Path Indexing
+
+ File paths are indexed relative to the `codebasePath` parameter you provide when indexing:
+
+ ```typescript
+ // Index a codebase - paths will be relative to this directory
+ await context.indexCodebase("/Users/username/projects/my-workspace");
+
+ // Search returns results with relative paths
+ const results = await context.semanticSearch("user authentication function");
+
+ // Results show paths relative to the indexed codebase path, e.g.:
+ // "packages/app/src/auth.ts" instead of full absolute path
+ ```
+
+ This makes search results:
+
+ - More readable and concise
+ - Portable across different machines
+ - Consistent in monorepo environments
+
+ ## File Synchronization Architecture
+
+ Claude Context implements an intelligent file synchronization system that efficiently tracks and processes only the files that have changed since the last indexing operation. This dramatically improves performance when working with large codebases.
+
+ ![File Synchronization Architecture](../../assets/file_synchronizer.png)
+
+ ### How It Works
+
+ The file synchronization system uses a **Merkle tree-based approach** combined with SHA-256 file hashing to detect changes:
+
+ #### 1. File Hashing
+
+ - Each file in the codebase is hashed using SHA-256
+ - File hashes are computed based on file content, not metadata
+ - Hashes are stored with relative file paths for consistency across different environments
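
The hashing step described above might look roughly like the following sketch, which uses Node's built-in `crypto` module. `hashContent` and `hashFiles` are hypothetical names; the package's `synchronizer.js` may organize this differently.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";
import { relative, sep } from "node:path";

// SHA-256 over the file's bytes only; timestamps and permissions play no part.
function hashContent(content: string | Buffer): string {
  return createHash("sha256").update(content).digest("hex");
}

// Hypothetical helper: map each file's path (relative to rootDir) to its content hash.
function hashFiles(rootDir: string, absolutePaths: string[]): Record<string, string> {
  const hashes: Record<string, string> = {};
  for (const filePath of absolutePaths) {
    // Use forward slashes so keys look the same on every platform.
    const key = relative(rootDir, filePath).split(sep).join("/");
    hashes[key] = hashContent(readFileSync(filePath));
  }
  return hashes;
}

console.log(hashContent("abc"));
// ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```

Keying by relative path means two checkouts of the same code in different directories produce the same map, which is the portability the bullets above describe.
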
+
+ #### 2. Merkle Tree Construction
+
+ - All file hashes are organized into a Merkle tree structure
+ - The tree provides a single root hash that represents the entire codebase state
+ - Any change to any file will cause the root hash to change
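
A minimal sketch of deriving a single root hash from per-file hashes is shown below. The tree layout is an assumption (a binary tree over leaves sorted by path, with an unpaired node combined with itself); the package's `merkle.js` may build its tree differently. `merkleRoot` is a hypothetical name.

```typescript
import { createHash } from "node:crypto";

const sha256 = (data: string): string =>
  createHash("sha256").update(data).digest("hex");

// Hypothetical helper: fold a path -> hash map into one root hash.
function merkleRoot(fileHashes: Record<string, string>): string {
  // Sort paths so the root does not depend on insertion order.
  let level = Object.keys(fileHashes)
    .sort()
    .map((path) => sha256(path + ":" + fileHashes[path]));
  if (level.length === 0) return sha256("");

  // Combine nodes pairwise until a single hash remains.
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // Assumption: an unpaired last node is combined with itself.
      const right = i + 1 < level.length ? level[i + 1] : level[i];
      next.push(sha256(level[i] + right));
    }
    level = next;
  }
  return level[0];
}

const before = merkleRoot({ "src/a.ts": "h1", "src/b.ts": "h2", "README.md": "h3" });
const after = merkleRoot({ "src/a.ts": "h1", "src/b.ts": "CHANGED", "README.md": "h3" });
console.log(before === after); // false: one changed file changes the root
```
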
+
+ #### 3. Snapshot Management
+
+ - File synchronization state is persisted to the `~/.context/merkle/` directory
+ - Each codebase gets a unique snapshot file based on its absolute path hash
+ - Snapshots contain both file hashes and serialized Merkle tree data
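
One way to derive a per-codebase snapshot location is sketched below. Only the `~/.context/merkle/` directory comes from the text above; the hash algorithm (SHA-256 here), the `.json` extension, and the name `snapshotPathFor` are assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";
import { homedir } from "node:os";
import { join, resolve } from "node:path";

// Hypothetical helper: one snapshot file per absolute codebase path.
function snapshotPathFor(codebasePath: string): string {
  // Resolve first so "./proj" and its absolute form map to the same file.
  const absolute = resolve(codebasePath);
  // Assumption: SHA-256 of the path and a .json extension.
  const id = createHash("sha256").update(absolute).digest("hex");
  return join(homedir(), ".context", "merkle", id + ".json");
}

console.log(snapshotPathFor("./my-project"));
```
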
+
+ #### 4. Change Detection Process
+
+ 1. **Quick Check**: Compare current Merkle root hash with stored snapshot
+ 2. **Detailed Analysis**: If root hashes differ, perform file-by-file comparison
+ 3. **Change Classification**: Categorize changes into three types:
+    - **Added**: New files that didn't exist before
+    - **Modified**: Existing files with changed content
+    - **Removed**: Files that were deleted from the codebase
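
The three steps above can be sketched as a comparison of two snapshots. The `Snapshot` and `FileChanges` shapes and the `detectChanges` name are hypothetical; the README does not show the package's own types.

```typescript
// Hypothetical shapes for this sketch.
interface Snapshot {
  rootHash: string;
  fileHashes: Record<string, string>; // relative path -> content hash
}

interface FileChanges {
  added: string[];
  modified: string[];
  removed: string[];
}

const has = (obj: Record<string, string>, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

function detectChanges(previous: Snapshot, current: Snapshot): FileChanges {
  const changes: FileChanges = { added: [], modified: [], removed: [] };

  // 1. Quick check: identical roots mean nothing changed.
  if (previous.rootHash === current.rootHash) return changes;

  // 2-3. File-by-file comparison and classification.
  for (const path of Object.keys(current.fileHashes)) {
    if (!has(previous.fileHashes, path)) {
      changes.added.push(path);
    } else if (previous.fileHashes[path] !== current.fileHashes[path]) {
      changes.modified.push(path);
    }
  }
  for (const path of Object.keys(previous.fileHashes)) {
    if (!has(current.fileHashes, path)) changes.removed.push(path);
  }
  return changes;
}

const oldSnapshot: Snapshot = {
  rootHash: "root-1",
  fileHashes: { "a.ts": "h1", "b.ts": "h2", "c.ts": "h3" },
};
const newSnapshot: Snapshot = {
  rootHash: "root-2",
  fileHashes: { "a.ts": "h1", "b.ts": "h2-changed", "d.ts": "h4" },
};

console.log(detectChanges(oldSnapshot, newSnapshot));
// added: ["d.ts"], modified: ["b.ts"], removed: ["c.ts"]
```
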
+
+ #### 5. Incremental Updates
+
+ - Only process files that have actually changed
+ - Update vector database entries only for modified chunks
+ - Remove entries for deleted files
+ - Add entries for new files
+
+ ## Contributing
+
+ This package is part of the Claude Context monorepo. Please see:
+
+ - [Main Contributing Guide](../../CONTRIBUTING.md) - General contribution guidelines
+ - [Core Package Contributing](CONTRIBUTING.md) - Specific development guide for this package
+
+ ## Related Packages
+
+ - **[@claude-context/mcp](../mcp)** - MCP server that uses this core engine
+ - **[VSCode Extension](../vscode-extension)** - VSCode extension built on this core
+
+ ## License
+
+ MIT - See [LICENSE](../../LICENSE) for details