@pleaseai/context-please-core 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (116)
  1. package/LICENSE +24 -0
  2. package/README.md +287 -0
  3. package/dist/.tsbuildinfo +1 -0
  4. package/dist/context.d.ts +276 -0
  5. package/dist/context.d.ts.map +1 -0
  6. package/dist/context.js +1072 -0
  7. package/dist/context.js.map +1 -0
  8. package/dist/embedding/base-embedding.d.ts +51 -0
  9. package/dist/embedding/base-embedding.d.ts.map +1 -0
  10. package/dist/embedding/base-embedding.js +36 -0
  11. package/dist/embedding/base-embedding.js.map +1 -0
  12. package/dist/embedding/gemini-embedding.d.ts +53 -0
  13. package/dist/embedding/gemini-embedding.d.ts.map +1 -0
  14. package/dist/embedding/gemini-embedding.js +152 -0
  15. package/dist/embedding/gemini-embedding.js.map +1 -0
  16. package/dist/embedding/index.d.ts +6 -0
  17. package/dist/embedding/index.d.ts.map +1 -0
  18. package/dist/embedding/index.js +24 -0
  19. package/dist/embedding/index.js.map +1 -0
  20. package/dist/embedding/ollama-embedding.d.ts +55 -0
  21. package/dist/embedding/ollama-embedding.d.ts.map +1 -0
  22. package/dist/embedding/ollama-embedding.js +192 -0
  23. package/dist/embedding/ollama-embedding.js.map +1 -0
  24. package/dist/embedding/openai-embedding.d.ts +36 -0
  25. package/dist/embedding/openai-embedding.d.ts.map +1 -0
  26. package/dist/embedding/openai-embedding.js +159 -0
  27. package/dist/embedding/openai-embedding.js.map +1 -0
  28. package/dist/embedding/voyageai-embedding.d.ts +44 -0
  29. package/dist/embedding/voyageai-embedding.d.ts.map +1 -0
  30. package/dist/embedding/voyageai-embedding.js +227 -0
  31. package/dist/embedding/voyageai-embedding.js.map +1 -0
  32. package/dist/index.d.ts +8 -0
  33. package/dist/index.d.ts.map +1 -0
  34. package/dist/index.js +24 -0
  35. package/dist/index.js.map +1 -0
  36. package/dist/splitter/ast-splitter.d.ts +22 -0
  37. package/dist/splitter/ast-splitter.d.ts.map +1 -0
  38. package/dist/splitter/ast-splitter.js +234 -0
  39. package/dist/splitter/ast-splitter.js.map +1 -0
  40. package/dist/splitter/index.d.ts +41 -0
  41. package/dist/splitter/index.d.ts.map +1 -0
  42. package/dist/splitter/index.js +27 -0
  43. package/dist/splitter/index.js.map +1 -0
  44. package/dist/splitter/langchain-splitter.d.ts +13 -0
  45. package/dist/splitter/langchain-splitter.d.ts.map +1 -0
  46. package/dist/splitter/langchain-splitter.js +118 -0
  47. package/dist/splitter/langchain-splitter.js.map +1 -0
  48. package/dist/sync/merkle.d.ts +26 -0
  49. package/dist/sync/merkle.d.ts.map +1 -0
  50. package/dist/sync/merkle.js +112 -0
  51. package/dist/sync/merkle.js.map +1 -0
  52. package/dist/sync/synchronizer.d.ts +30 -0
  53. package/dist/sync/synchronizer.d.ts.map +1 -0
  54. package/dist/sync/synchronizer.js +339 -0
  55. package/dist/sync/synchronizer.js.map +1 -0
  56. package/dist/types.d.ts +14 -0
  57. package/dist/types.d.ts.map +1 -0
  58. package/dist/types.js +3 -0
  59. package/dist/types.js.map +1 -0
  60. package/dist/utils/env-manager.d.ts +19 -0
  61. package/dist/utils/env-manager.d.ts.map +1 -0
  62. package/dist/utils/env-manager.js +125 -0
  63. package/dist/utils/env-manager.js.map +1 -0
  64. package/dist/utils/index.d.ts +2 -0
  65. package/dist/utils/index.d.ts.map +1 -0
  66. package/dist/utils/index.js +7 -0
  67. package/dist/utils/index.js.map +1 -0
  68. package/dist/vectordb/base/base-vector-database.d.ts +58 -0
  69. package/dist/vectordb/base/base-vector-database.d.ts.map +1 -0
  70. package/dist/vectordb/base/base-vector-database.js +32 -0
  71. package/dist/vectordb/base/base-vector-database.js.map +1 -0
  72. package/dist/vectordb/factory.d.ts +80 -0
  73. package/dist/vectordb/factory.d.ts.map +1 -0
  74. package/dist/vectordb/factory.js +89 -0
  75. package/dist/vectordb/factory.js.map +1 -0
  76. package/dist/vectordb/index.d.ts +12 -0
  77. package/dist/vectordb/index.d.ts.map +1 -0
  78. package/dist/vectordb/index.js +27 -0
  79. package/dist/vectordb/index.js.map +1 -0
  80. package/dist/vectordb/milvus-restful-vectordb.d.ts +75 -0
  81. package/dist/vectordb/milvus-restful-vectordb.d.ts.map +1 -0
  82. package/dist/vectordb/milvus-restful-vectordb.js +707 -0
  83. package/dist/vectordb/milvus-restful-vectordb.js.map +1 -0
  84. package/dist/vectordb/milvus-vectordb.d.ts +59 -0
  85. package/dist/vectordb/milvus-vectordb.d.ts.map +1 -0
  86. package/dist/vectordb/milvus-vectordb.js +641 -0
  87. package/dist/vectordb/milvus-vectordb.js.map +1 -0
  88. package/dist/vectordb/qdrant-vectordb.d.ts +124 -0
  89. package/dist/vectordb/qdrant-vectordb.d.ts.map +1 -0
  90. package/dist/vectordb/qdrant-vectordb.js +582 -0
  91. package/dist/vectordb/qdrant-vectordb.js.map +1 -0
  92. package/dist/vectordb/sparse/index.d.ts +4 -0
  93. package/dist/vectordb/sparse/index.d.ts.map +1 -0
  94. package/dist/vectordb/sparse/index.js +23 -0
  95. package/dist/vectordb/sparse/index.js.map +1 -0
  96. package/dist/vectordb/sparse/simple-bm25.d.ts +104 -0
  97. package/dist/vectordb/sparse/simple-bm25.d.ts.map +1 -0
  98. package/dist/vectordb/sparse/simple-bm25.js +189 -0
  99. package/dist/vectordb/sparse/simple-bm25.js.map +1 -0
  100. package/dist/vectordb/sparse/sparse-vector-generator.d.ts +54 -0
  101. package/dist/vectordb/sparse/sparse-vector-generator.d.ts.map +1 -0
  102. package/dist/vectordb/sparse/sparse-vector-generator.js +3 -0
  103. package/dist/vectordb/sparse/sparse-vector-generator.js.map +1 -0
  104. package/dist/vectordb/sparse/types.d.ts +38 -0
  105. package/dist/vectordb/sparse/types.d.ts.map +1 -0
  106. package/dist/vectordb/sparse/types.js +3 -0
  107. package/dist/vectordb/sparse/types.js.map +1 -0
  108. package/dist/vectordb/types.d.ts +120 -0
  109. package/dist/vectordb/types.d.ts.map +1 -0
  110. package/dist/vectordb/types.js +9 -0
  111. package/dist/vectordb/types.js.map +1 -0
  112. package/dist/vectordb/zilliz-utils.d.ts +135 -0
  113. package/dist/vectordb/zilliz-utils.d.ts.map +1 -0
  114. package/dist/vectordb/zilliz-utils.js +192 -0
  115. package/dist/vectordb/zilliz-utils.js.map +1 -0
  116. package/package.json +61 -0
package/LICENSE ADDED
@@ -0,0 +1,24 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 PleaseAI
4
+
5
+ This project is a fork of claude-context (https://github.com/zilliztech/claude-context)
6
+ Original Copyright (c) 2025 Zilliz
7
+
8
+ Permission is hereby granted, free of charge, to any person obtaining a copy
9
+ of this software and associated documentation files (the "Software"), to deal
10
+ in the Software without restriction, including without limitation the rights
11
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
12
+ copies of the Software, and to permit persons to whom the Software is
13
+ furnished to do so, subject to the following conditions:
14
+
15
+ The above copyright notice and this permission notice shall be included in all
16
+ copies or substantial portions of the Software.
17
+
18
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
19
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
20
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
21
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
22
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
23
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
24
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,287 @@
1
+ # @pleaseai/context-please-core
2
+ ![](../../assets/claude-context.png)
3
+
4
+ The core indexing engine for Context Please - a powerful tool for semantic search and analysis of codebases using vector embeddings and AI.
5
+
6
+ > **Note:** This is a fork of [@zilliz/claude-context-core](https://www.npmjs.com/package/@zilliz/claude-context-core) by Zilliz, maintained by PleaseAI.
7
+
8
+ [![npm version](https://img.shields.io/npm/v/@pleaseai/context-please-core.svg)](https://www.npmjs.com/package/@pleaseai/context-please-core)
9
+ [![npm downloads](https://img.shields.io/npm/dm/@pleaseai/context-please-core.svg)](https://www.npmjs.com/package/@pleaseai/context-please-core)
10
+
11
+ > 📖 **New to Context Please?** Check out the [main project README](../../README.md) for an overview and quick start guide.
12
+
13
+ ## Installation
14
+
15
+ ```bash
16
+ npm install @pleaseai/context-please-core
17
+ ```
18
+
19
+ ### Prepare Environment Variables
20
+ #### OpenAI API key
21
+ See the [OpenAI documentation](https://platform.openai.com/docs/api-reference) for details on obtaining your API key.
22
+ ```bash
23
+ OPENAI_API_KEY=your-openai-api-key
24
+ ```
25
+
26
+ #### Zilliz Cloud configuration
27
+ Get a free Milvus vector database on Zilliz Cloud.
28
+
29
+ Claude Context needs a vector database. You can [sign up](https://cloud.zilliz.com/signup?utm_source=github&utm_medium=referral&utm_campaign=2507-codecontext-readme) on Zilliz Cloud to get a free Serverless cluster.
30
+
31
+ ![](../../assets/signup_and_create_cluster.jpeg)
32
+
33
+ After creating your cluster, open your Zilliz Cloud console and copy both the **public endpoint** and your **API key**.
34
+ These will be used as `your-zilliz-cloud-public-endpoint` and `your-zilliz-cloud-api-key` in the configuration examples.
35
+
36
+ ![Zilliz Cloud Dashboard](../../assets/zilliz_cloud_dashboard.jpeg)
37
+
38
+ Keep both values handy for the configuration steps below.
39
+
40
+ If you need help creating your free vector database or finding these values, see the [Zilliz Cloud documentation](https://docs.zilliz.com/docs/create-cluster) for detailed instructions.
41
+
42
+ ```bash
43
+ MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint
44
+ MILVUS_TOKEN=your-zilliz-cloud-api-key
45
+ ```
46
+
47
+ > 💡 **Tip**: For easier configuration management across different usage scenarios, consider using [global environment variables](../../docs/getting-started/environment-variables.md).
48
+
49
+ ## Quick Start
50
+
51
+ ```typescript
52
+ import {
53
+ Context,
54
+ OpenAIEmbedding,
55
+ MilvusVectorDatabase
56
+ } from '@pleaseai/context-please-core';
57
+
58
+ // Initialize embedding provider
59
+ const embedding = new OpenAIEmbedding({
60
+ apiKey: process.env.OPENAI_API_KEY || 'your-openai-api-key',
61
+ model: 'text-embedding-3-small'
62
+ });
63
+
64
+ // Initialize vector database
65
+ const vectorDatabase = new MilvusVectorDatabase({
66
+ address: process.env.MILVUS_ADDRESS || 'localhost:19530',
67
+ token: process.env.MILVUS_TOKEN || ''
68
+ });
69
+
70
+ // Create context instance
71
+ const context = new Context({
72
+ embedding,
73
+ vectorDatabase
74
+ });
75
+
76
+ // Index a codebase
77
+ const stats = await context.indexCodebase('./my-project', (progress) => {
78
+ console.log(`${progress.phase} - ${progress.percentage}%`);
79
+ });
80
+
81
+ console.log(`Indexed ${stats.indexedFiles} files with ${stats.totalChunks} chunks`);
82
+
83
+ // Search the codebase
84
+ const results = await context.semanticSearch(
85
+ './my-project',
86
+ 'function that handles user authentication',
87
+ 5
88
+ );
89
+
90
+ results.forEach(result => {
91
+ console.log(`${result.relativePath}:${result.startLine}-${result.endLine}`);
92
+ console.log(`Score: ${result.score}`);
93
+ console.log(result.content);
94
+ });
95
+ ```
96
+
97
+ ## Features
98
+
99
+ - **Multi-language Support**: Index TypeScript, JavaScript, Python, Java, C++, and many other programming languages
100
+ - **Semantic Search**: Find code using natural language queries powered by AI embeddings
101
+ - **Flexible Architecture**: Pluggable embedding providers and vector databases
102
+ - **Smart Chunking**: Intelligent code splitting that preserves context and structure
103
+ - **Batch Processing**: Efficient processing of large codebases with progress tracking
104
+ - **Pattern Matching**: Built-in ignore patterns for common build artifacts and dependencies
105
+ - **Incremental File Synchronization**: Efficient change detection using Merkle trees to only re-index modified files
106
+
107
+ ## Embedding Providers
108
+
109
+ - **OpenAI Embeddings** (`text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`)
110
+ - **VoyageAI Embeddings** - High-quality embeddings optimized for code (`voyage-code-3`, `voyage-3.5`, etc.)
111
+ - **Gemini Embeddings** - Google's embedding models (`gemini-embedding-001`)
112
+ - **Ollama Embeddings** - Local embedding models via Ollama
113
+
114
+ ## Vector Database Support
115
+
116
+ - **Milvus/Zilliz Cloud** - High-performance vector database
117
+
118
+ ## Code Splitters
119
+
120
+ - **AST Code Splitter** - AST-based code splitting with automatic fallback (default)
121
+ - **LangChain Code Splitter** - Character-based code chunking
122
+
123
+ ## Configuration
124
+
125
+ ### ContextConfig
126
+
127
+ ```typescript
128
+ interface ContextConfig {
129
+ embedding?: Embedding; // Embedding provider
130
+ vectorDatabase?: VectorDatabase; // Vector database instance (marked optional in the type, but documented as required)
131
+ codeSplitter?: Splitter; // Code splitting strategy
132
+ supportedExtensions?: string[]; // File extensions to index
133
+ ignorePatterns?: string[]; // Patterns to ignore
134
+ customExtensions?: string[]; // Custom extensions from MCP
135
+ customIgnorePatterns?: string[]; // Custom ignore patterns from MCP
136
+ }
137
+ ```
138
+
139
+ ### Supported File Extensions (Default)
140
+
141
+ ```typescript
142
+ [
143
+ // Programming languages
144
+ '.ts', '.tsx', '.js', '.jsx', '.py', '.java', '.cpp', '.c', '.h', '.hpp',
145
+ '.cs', '.go', '.rs', '.php', '.rb', '.swift', '.kt', '.scala', '.m', '.mm',
146
+ // Text and markup files
147
+ '.md', '.markdown', '.ipynb'
148
+ ]
149
+ ```
150
+
151
+ ### Default Ignore Patterns
152
+
153
+ - Build and dependency directories: `node_modules/**`, `dist/**`, `build/**`, `out/**`, `target/**`
154
+ - Version control: `.git/**`, `.svn/**`, `.hg/**`
155
+ - IDE files: `.vscode/**`, `.idea/**`, `*.swp`, `*.swo`
156
+ - Cache directories: `.cache/**`, `__pycache__/**`, `.pytest_cache/**`, `coverage/**`
157
+ - Minified files: `*.min.js`, `*.min.css`, `*.bundle.js`, `*.map`
158
+ - Log and temp files: `logs/**`, `tmp/**`, `temp/**`, `*.log`
159
+ - Environment files: `.env`, `.env.*`, `*.local`
160
+
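The ignore patterns listed above are glob-style. As a rough illustration of how such patterns could be applied to relative file paths, here is a minimal sketch. The helper names `globToRegExp` and `isIgnored` are hypothetical, and the matching rules (only `*` and `**` are supported; slash-free patterns are tested against each path segment) are assumptions made for this example, not the package's actual matcher.

```typescript
// Hypothetical helper: convert a simple glob into a RegExp.
// Only `*` (any characters except `/`) and `**` (any characters) are handled.
function globToRegExp(glob: string): RegExp {
  // Escape regex metacharacters, leaving `*` untouched for the steps below.
  const escaped = glob.replace(/[.+^${}()|[\]\\?]/g, '\\$&');
  const body = escaped
    .replace(/\*\*/g, '\u0000') // temporary placeholder for `**`
    .replace(/\*/g, '[^/]*')
    .replace(/\u0000/g, '.*');
  return new RegExp(`^${body}$`);
}

// Hypothetical helper: decide whether a relative path matches any pattern.
function isIgnored(relativePath: string, patterns: string[]): boolean {
  const segments = relativePath.split('/');
  return patterns.some((pattern) => {
    const regex = globToRegExp(pattern);
    if (!pattern.includes('/')) {
      // Assumption: slash-free patterns (e.g. `*.log`) apply to any path segment.
      return segments.some((segment) => regex.test(segment));
    }
    // Assumption: patterns containing a slash may match from any segment boundary,
    // so `dist/**` also covers `packages/a/dist/index.js`.
    return segments.some((_, i) => regex.test(segments.slice(i).join('/')));
  });
}

// A subset of the default patterns from the list above.
const samplePatterns = ['node_modules/**', 'dist/**', '*.min.js', '.env', '.env.*', '*.log'];

console.log(isIgnored('node_modules/lodash/index.js', samplePatterns));
console.log(isIgnored('src/index.ts', samplePatterns));
```

A real matcher would typically also handle `?`, character classes, and negation, so a maintained glob library is usually the better choice in production code.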
161
+ ## API Reference
162
+
163
+ ### Context
164
+
165
+ #### Methods
166
+
167
+ - `indexCodebase(path, progressCallback?, forceReindex?)` - Index an entire codebase
168
+ - `reindexByChange(path, progressCallback?)` - Incrementally re-index only changed files
169
+ - `semanticSearch(path, query, topK?, threshold?, filterExpr?)` - Search indexed code semantically
170
+ - `hasIndex(path)` - Check if codebase is already indexed
171
+ - `clearIndex(path, progressCallback?)` - Remove index for a codebase
172
+ - `updateIgnorePatterns(patterns)` - Update ignore patterns
173
+ - `addCustomIgnorePatterns(patterns)` - Add custom ignore patterns
174
+ - `addCustomExtensions(extensions)` - Add custom file extensions
175
+ - `updateEmbedding(embedding)` - Switch embedding provider
176
+ - `updateVectorDatabase(vectorDB)` - Switch vector database
177
+ - `updateSplitter(splitter)` - Switch code splitter
178
+
179
+ ### Search Results
180
+
181
+ ```typescript
182
+ interface SemanticSearchResult {
183
+ content: string; // Code content
184
+ relativePath: string; // File path relative to codebase root
185
+ startLine: number; // Starting line number
186
+ endLine: number; // Ending line number
187
+ language: string; // Programming language
188
+ score: number; // Similarity score (0-1)
189
+ }
190
+ ```
191
+
192
+
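To show how the `SemanticSearchResult` shape above might be consumed, the following sketch filters results by a minimum score and formats them for display. The `formatResults` helper, the `minScore` cutoff, and the sample data are all hypothetical; only the interface fields come from the README.

```typescript
// Interface copied from the README section above.
interface SemanticSearchResult {
  content: string;
  relativePath: string;
  startLine: number;
  endLine: number;
  language: string;
  score: number;
}

// Hypothetical helper: keep results at or above minScore, best match first.
function formatResults(results: SemanticSearchResult[], minScore: number): string[] {
  return results
    .filter((r) => r.score >= minScore) // filter() copies, so the input is not mutated by sort()
    .sort((a, b) => b.score - a.score)
    .map((r) => `${r.relativePath}:${r.startLine}-${r.endLine} (${r.language}, score ${r.score.toFixed(2)})`);
}

// Made-up sample data for illustration only.
const sampleResults: SemanticSearchResult[] = [
  { content: 'function helper() {}', relativePath: 'src/util.ts', startLine: 1, endLine: 3, language: 'typescript', score: 0.42 },
  { content: 'function login() {}', relativePath: 'src/auth.ts', startLine: 10, endLine: 42, language: 'typescript', score: 0.91 },
  { content: 'def check_token(): ...', relativePath: 'api/token.py', startLine: 5, endLine: 20, language: 'python', score: 0.77 },
];

const formatted = formatResults(sampleResults, 0.5);
formatted.forEach((line) => console.log(line));
```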
193
+ ## Examples
194
+
195
+ ### Using VoyageAI Embeddings
196
+
197
+ ```typescript
198
+ import { Context, MilvusVectorDatabase, VoyageAIEmbedding } from '@pleaseai/context-please-core';
199
+
200
+ // Initialize with VoyageAI embedding provider
201
+ const embedding = new VoyageAIEmbedding({
202
+ apiKey: process.env.VOYAGEAI_API_KEY || 'your-voyageai-api-key',
203
+ model: 'voyage-code-3'
204
+ });
205
+
206
+ const vectorDatabase = new MilvusVectorDatabase({
207
+ address: process.env.MILVUS_ADDRESS || 'localhost:19530',
208
+ token: process.env.MILVUS_TOKEN || ''
209
+ });
210
+
211
+ const context = new Context({
212
+ embedding,
213
+ vectorDatabase
214
+ });
215
+ ```
216
+
217
+ ### Custom File Filtering
218
+
219
+ ```typescript
220
+ const context = new Context({
221
+ embedding,
222
+ vectorDatabase,
223
+ supportedExtensions: ['.ts', '.js', '.py', '.java'],
224
+ ignorePatterns: [
225
+ 'node_modules/**',
226
+ 'dist/**',
227
+ '*.spec.ts',
228
+ '*.test.js'
229
+ ]
230
+ });
231
+ ```
232
+
233
+ ## File Synchronization Architecture
234
+
235
+ Claude Context implements a file synchronization system that tracks and processes only the files that have changed since the last indexing operation. On large codebases, this avoids re-indexing files whose content is unchanged.
236
+
237
+ ![File Synchronization Architecture](../../assets/file_synchronizer.png)
238
+
239
+ ### How It Works
240
+
241
+ The file synchronization system uses a **Merkle tree-based approach** combined with SHA-256 file hashing to detect changes:
242
+
243
+ #### 1. File Hashing
244
+ - Each file in the codebase is hashed using SHA-256
245
+ - File hashes are computed based on file content, not metadata
246
+ - Hashes are stored with relative file paths for consistency across different environments
247
+
248
+ #### 2. Merkle Tree Construction
249
+ - All file hashes are organized into a Merkle tree structure
250
+ - The tree provides a single root hash that represents the entire codebase state
251
+ - Any change to any file will cause the root hash to change
252
+
253
+ #### 3. Snapshot Management
254
+ - File synchronization state is persisted to the `~/.context/merkle/` directory
255
+ - Each codebase gets a unique snapshot file based on its absolute path hash
256
+ - Snapshots contain both file hashes and serialized Merkle tree data
257
+
258
+ #### 4. Change Detection Process
259
+ 1. **Quick Check**: Compare current Merkle root hash with stored snapshot
260
+ 2. **Detailed Analysis**: If root hashes differ, perform file-by-file comparison
261
+ 3. **Change Classification**: Categorize changes into three types:
262
+ - **Added**: New files that didn't exist before
263
+ - **Modified**: Existing files with changed content
264
+ - **Removed**: Files that were deleted from the codebase
265
+
266
+ #### 5. Incremental Updates
267
+ - Only process files that have actually changed
268
+ - Update vector database entries only for modified chunks
269
+ - Remove entries for deleted files
270
+ - Add entries for new files
271
+
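The steps above (content hashing, a Merkle root as a quick check, then file-by-file classification) can be sketched as follows. This is a simplified illustration under stated assumptions, not the package's `merkle.js` or `synchronizer.js`: the names `hashFiles`, `merkleRoot`, and `detectChanges` are hypothetical, file contents are passed in as strings rather than read from disk, and the tree is built by pairwise folding of sorted leaves.

```typescript
import { createHash } from 'node:crypto';

// Relative path -> SHA-256 hex digest of the file content.
type FileHashes = Map<string, string>;

interface ChangeSet { added: string[]; modified: string[]; removed: string[]; }

function sha256(data: string): string {
  return createHash('sha256').update(data).digest('hex');
}

// Step 1: hash file content only (no metadata), keyed by relative path.
function hashFiles(files: Record<string, string>): FileHashes {
  const hashes: FileHashes = new Map();
  for (const [relativePath, content] of Object.entries(files)) {
    hashes.set(relativePath, sha256(content));
  }
  return hashes;
}

// Step 2: fold the sorted leaves pairwise until a single root hash remains.
function merkleRoot(hashes: FileHashes): string {
  let level = Array.from(hashes.entries())
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([path, hash]) => sha256(`${path}:${hash}`));
  if (level.length === 0) return sha256('');
  while (level.length > 1) {
    const next: string[] = [];
    for (let i = 0; i < level.length; i += 2) {
      // Assumption: an unpaired last node is combined with itself.
      next.push(sha256(level[i] + (level[i + 1] ?? level[i])));
    }
    level = next;
  }
  return level[0];
}

// Step 4: quick root comparison first, then classify individual files.
function detectChanges(previous: FileHashes, current: FileHashes): ChangeSet {
  const changes: ChangeSet = { added: [], modified: [], removed: [] };
  if (merkleRoot(previous) === merkleRoot(current)) return changes;
  current.forEach((hash, path) => {
    const old = previous.get(path);
    if (old === undefined) changes.added.push(path);
    else if (old !== hash) changes.modified.push(path);
  });
  previous.forEach((_hash, path) => {
    if (!current.has(path)) changes.removed.push(path);
  });
  return changes;
}

// Usage: compare a stored snapshot with the current state.
const before = hashFiles({ 'src/a.ts': 'export const a = 1;', 'src/b.ts': 'export const b = 2;' });
const after = hashFiles({ 'src/a.ts': 'export const a = 42;', 'src/c.ts': 'export const c = 3;' });
const changes = detectChanges(before, after);
console.log(changes);
```

In this sketch the root comparison only serves as an early exit; the per-file maps are what identify which entries to add, update, or remove in the vector database.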
272
+
273
+ ## Contributing
274
+
275
+ This package is part of the Claude Context monorepo. Please see:
276
+ - [Main Contributing Guide](../../CONTRIBUTING.md) - General contribution guidelines
277
+ - [Core Package Contributing](CONTRIBUTING.md) - Specific development guide for this package
278
+
279
+ ## Related Packages
280
+
281
+ - **[@claude-context/mcp](../mcp)** - MCP server that uses this core engine
282
+ - **[VSCode Extension](../vscode-extension)** - VSCode extension built on this core
283
+
284
+
285
+ ## License
286
+
287
+ MIT - See [LICENSE](../../LICENSE) for details