@pleaseai/context-please-core 0.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +24 -0
- package/README.md +287 -0
- package/dist/.tsbuildinfo +1 -0
- package/dist/context.d.ts +276 -0
- package/dist/context.d.ts.map +1 -0
- package/dist/context.js +1072 -0
- package/dist/context.js.map +1 -0
- package/dist/embedding/base-embedding.d.ts +51 -0
- package/dist/embedding/base-embedding.d.ts.map +1 -0
- package/dist/embedding/base-embedding.js +36 -0
- package/dist/embedding/base-embedding.js.map +1 -0
- package/dist/embedding/gemini-embedding.d.ts +53 -0
- package/dist/embedding/gemini-embedding.d.ts.map +1 -0
- package/dist/embedding/gemini-embedding.js +152 -0
- package/dist/embedding/gemini-embedding.js.map +1 -0
- package/dist/embedding/index.d.ts +6 -0
- package/dist/embedding/index.d.ts.map +1 -0
- package/dist/embedding/index.js +24 -0
- package/dist/embedding/index.js.map +1 -0
- package/dist/embedding/ollama-embedding.d.ts +55 -0
- package/dist/embedding/ollama-embedding.d.ts.map +1 -0
- package/dist/embedding/ollama-embedding.js +192 -0
- package/dist/embedding/ollama-embedding.js.map +1 -0
- package/dist/embedding/openai-embedding.d.ts +36 -0
- package/dist/embedding/openai-embedding.d.ts.map +1 -0
- package/dist/embedding/openai-embedding.js +159 -0
- package/dist/embedding/openai-embedding.js.map +1 -0
- package/dist/embedding/voyageai-embedding.d.ts +44 -0
- package/dist/embedding/voyageai-embedding.d.ts.map +1 -0
- package/dist/embedding/voyageai-embedding.js +227 -0
- package/dist/embedding/voyageai-embedding.js.map +1 -0
- package/dist/index.d.ts +8 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +24 -0
- package/dist/index.js.map +1 -0
- package/dist/splitter/ast-splitter.d.ts +22 -0
- package/dist/splitter/ast-splitter.d.ts.map +1 -0
- package/dist/splitter/ast-splitter.js +234 -0
- package/dist/splitter/ast-splitter.js.map +1 -0
- package/dist/splitter/index.d.ts +41 -0
- package/dist/splitter/index.d.ts.map +1 -0
- package/dist/splitter/index.js +27 -0
- package/dist/splitter/index.js.map +1 -0
- package/dist/splitter/langchain-splitter.d.ts +13 -0
- package/dist/splitter/langchain-splitter.d.ts.map +1 -0
- package/dist/splitter/langchain-splitter.js +118 -0
- package/dist/splitter/langchain-splitter.js.map +1 -0
- package/dist/sync/merkle.d.ts +26 -0
- package/dist/sync/merkle.d.ts.map +1 -0
- package/dist/sync/merkle.js +112 -0
- package/dist/sync/merkle.js.map +1 -0
- package/dist/sync/synchronizer.d.ts +30 -0
- package/dist/sync/synchronizer.d.ts.map +1 -0
- package/dist/sync/synchronizer.js +339 -0
- package/dist/sync/synchronizer.js.map +1 -0
- package/dist/types.d.ts +14 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +3 -0
- package/dist/types.js.map +1 -0
- package/dist/utils/env-manager.d.ts +19 -0
- package/dist/utils/env-manager.d.ts.map +1 -0
- package/dist/utils/env-manager.js +125 -0
- package/dist/utils/env-manager.js.map +1 -0
- package/dist/utils/index.d.ts +2 -0
- package/dist/utils/index.d.ts.map +1 -0
- package/dist/utils/index.js +7 -0
- package/dist/utils/index.js.map +1 -0
- package/dist/vectordb/base/base-vector-database.d.ts +58 -0
- package/dist/vectordb/base/base-vector-database.d.ts.map +1 -0
- package/dist/vectordb/base/base-vector-database.js +32 -0
- package/dist/vectordb/base/base-vector-database.js.map +1 -0
- package/dist/vectordb/factory.d.ts +80 -0
- package/dist/vectordb/factory.d.ts.map +1 -0
- package/dist/vectordb/factory.js +89 -0
- package/dist/vectordb/factory.js.map +1 -0
- package/dist/vectordb/index.d.ts +12 -0
- package/dist/vectordb/index.d.ts.map +1 -0
- package/dist/vectordb/index.js +27 -0
- package/dist/vectordb/index.js.map +1 -0
- package/dist/vectordb/milvus-restful-vectordb.d.ts +75 -0
- package/dist/vectordb/milvus-restful-vectordb.d.ts.map +1 -0
- package/dist/vectordb/milvus-restful-vectordb.js +707 -0
- package/dist/vectordb/milvus-restful-vectordb.js.map +1 -0
- package/dist/vectordb/milvus-vectordb.d.ts +59 -0
- package/dist/vectordb/milvus-vectordb.d.ts.map +1 -0
- package/dist/vectordb/milvus-vectordb.js +641 -0
- package/dist/vectordb/milvus-vectordb.js.map +1 -0
- package/dist/vectordb/qdrant-vectordb.d.ts +124 -0
- package/dist/vectordb/qdrant-vectordb.d.ts.map +1 -0
- package/dist/vectordb/qdrant-vectordb.js +582 -0
- package/dist/vectordb/qdrant-vectordb.js.map +1 -0
- package/dist/vectordb/sparse/index.d.ts +4 -0
- package/dist/vectordb/sparse/index.d.ts.map +1 -0
- package/dist/vectordb/sparse/index.js +23 -0
- package/dist/vectordb/sparse/index.js.map +1 -0
- package/dist/vectordb/sparse/simple-bm25.d.ts +104 -0
- package/dist/vectordb/sparse/simple-bm25.d.ts.map +1 -0
- package/dist/vectordb/sparse/simple-bm25.js +189 -0
- package/dist/vectordb/sparse/simple-bm25.js.map +1 -0
- package/dist/vectordb/sparse/sparse-vector-generator.d.ts +54 -0
- package/dist/vectordb/sparse/sparse-vector-generator.d.ts.map +1 -0
- package/dist/vectordb/sparse/sparse-vector-generator.js +3 -0
- package/dist/vectordb/sparse/sparse-vector-generator.js.map +1 -0
- package/dist/vectordb/sparse/types.d.ts +38 -0
- package/dist/vectordb/sparse/types.d.ts.map +1 -0
- package/dist/vectordb/sparse/types.js +3 -0
- package/dist/vectordb/sparse/types.js.map +1 -0
- package/dist/vectordb/types.d.ts +120 -0
- package/dist/vectordb/types.d.ts.map +1 -0
- package/dist/vectordb/types.js +9 -0
- package/dist/vectordb/types.js.map +1 -0
- package/dist/vectordb/zilliz-utils.d.ts +135 -0
- package/dist/vectordb/zilliz-utils.d.ts.map +1 -0
- package/dist/vectordb/zilliz-utils.js +192 -0
- package/dist/vectordb/zilliz-utils.js.map +1 -0
- package/package.json +61 -0
package/LICENSE
ADDED
@@ -0,0 +1,24 @@
MIT License

Copyright (c) 2025 PleaseAI

This project is a fork of claude-context (https://github.com/zilliztech/claude-context)
Original Copyright (c) 2025 Zilliz

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,287 @@
# @pleaseai/context-please-core

The core indexing engine for Context Please - a powerful tool for semantic search and analysis of codebases using vector embeddings and AI.

> **Note:** This is a fork of [@zilliz/claude-context-core](https://www.npmjs.com/package/@zilliz/claude-context-core) by Zilliz, maintained by PleaseAI.

[npm package](https://www.npmjs.com/package/@pleaseai/context-please-core)

> 📖 **New to Context Please?** Check out the [main project README](../../README.md) for an overview and quick start guide.

## Installation

```bash
npm install @pleaseai/context-please-core
```

### Prepare Environment Variables
#### OpenAI API key
See the [OpenAI documentation](https://platform.openai.com/docs/api-reference) for details on obtaining an API key.
```bash
OPENAI_API_KEY=your-openai-api-key
```

#### Zilliz Cloud configuration
Get a free Milvus vector database on Zilliz Cloud.

Context Please needs a vector database. You can [sign up](https://cloud.zilliz.com/signup?utm_source=github&utm_medium=referral&utm_campaign=2507-codecontext-readme) on Zilliz Cloud to get a free Serverless cluster.

After creating your cluster, open your Zilliz Cloud console and copy both the **public endpoint** and your **API key**.
These will be used as `your-zilliz-cloud-public-endpoint` and `your-zilliz-cloud-api-key` in the configuration examples.

Keep both values handy for the configuration steps below.

If you need help creating your free vector database or finding these values, see the [Zilliz Cloud documentation](https://docs.zilliz.com/docs/create-cluster) for detailed instructions.

```bash
MILVUS_ADDRESS=your-zilliz-cloud-public-endpoint
MILVUS_TOKEN=your-zilliz-cloud-api-key
```

> 💡 **Tip**: For easier configuration management across different usage scenarios, consider using [global environment variables](../../docs/getting-started/environment-variables.md).

## Quick Start

```typescript
import {
    Context,
    OpenAIEmbedding,
    MilvusVectorDatabase
} from '@pleaseai/context-please-core';

// Initialize embedding provider
const embedding = new OpenAIEmbedding({
    apiKey: process.env.OPENAI_API_KEY || 'your-openai-api-key',
    model: 'text-embedding-3-small'
});

// Initialize vector database
const vectorDatabase = new MilvusVectorDatabase({
    address: process.env.MILVUS_ADDRESS || 'localhost:19530',
    token: process.env.MILVUS_TOKEN || ''
});

// Create context instance
const context = new Context({
    embedding,
    vectorDatabase
});

// Index a codebase
const stats = await context.indexCodebase('./my-project', (progress) => {
    console.log(`${progress.phase} - ${progress.percentage}%`);
});

console.log(`Indexed ${stats.indexedFiles} files with ${stats.totalChunks} chunks`);

// Search the codebase
const results = await context.semanticSearch(
    './my-project',
    'function that handles user authentication',
    5
);

results.forEach(result => {
    console.log(`${result.relativePath}:${result.startLine}-${result.endLine}`);
    console.log(`Score: ${result.score}`);
    console.log(result.content);
});
```

## Features

- **Multi-language Support**: Index TypeScript, JavaScript, Python, Java, C++, and many other programming languages
- **Semantic Search**: Find code using natural language queries powered by AI embeddings
- **Flexible Architecture**: Pluggable embedding providers and vector databases
- **Smart Chunking**: Intelligent code splitting that preserves context and structure
- **Batch Processing**: Efficient processing of large codebases with progress tracking
- **Pattern Matching**: Built-in ignore patterns for common build artifacts and dependencies
- **Incremental File Synchronization**: Efficient change detection using Merkle trees to only re-index modified files

## Embedding Providers

- **OpenAI Embeddings** (`text-embedding-3-small`, `text-embedding-3-large`, `text-embedding-ada-002`)
- **VoyageAI Embeddings** - High-quality embeddings optimized for code (`voyage-code-3`, `voyage-3.5`, etc.)
- **Gemini Embeddings** - Google's embedding models (`gemini-embedding-001`)
- **Ollama Embeddings** - Local embedding models via Ollama

## Vector Database Support

- **Milvus/Zilliz Cloud** - High-performance vector database

## Code Splitters

- **AST Code Splitter** - AST-based code splitting with automatic fallback (default)
- **LangChain Code Splitter** - Character-based code chunking

## Configuration

### ContextConfig

```typescript
interface ContextConfig {
    embedding?: Embedding;            // Embedding provider
    vectorDatabase?: VectorDatabase;  // Vector database instance (required)
    codeSplitter?: Splitter;          // Code splitting strategy
    supportedExtensions?: string[];   // File extensions to index
    ignorePatterns?: string[];        // Patterns to ignore
    customExtensions?: string[];      // Custom extensions from MCP
    customIgnorePatterns?: string[];  // Custom ignore patterns from MCP
}
```

### Supported File Extensions (Default)

```typescript
[
    // Programming languages
    '.ts', '.tsx', '.js', '.jsx', '.py', '.java', '.cpp', '.c', '.h', '.hpp',
    '.cs', '.go', '.rs', '.php', '.rb', '.swift', '.kt', '.scala', '.m', '.mm',
    // Text and markup files
    '.md', '.markdown', '.ipynb'
]
```

### Default Ignore Patterns

- Build and dependency directories: `node_modules/**`, `dist/**`, `build/**`, `out/**`, `target/**`
- Version control: `.git/**`, `.svn/**`, `.hg/**`
- IDE files: `.vscode/**`, `.idea/**`, `*.swp`, `*.swo`
- Cache directories: `.cache/**`, `__pycache__/**`, `.pytest_cache/**`, `coverage/**`
- Minified files: `*.min.js`, `*.min.css`, `*.bundle.js`, `*.map`
- Log and temp files: `logs/**`, `tmp/**`, `temp/**`, `*.log`
- Environment files: `.env`, `.env.*`, `*.local`

## API Reference

### Context

#### Methods

- `indexCodebase(path, progressCallback?, forceReindex?)` - Index an entire codebase
- `reindexByChange(path, progressCallback?)` - Incrementally re-index only changed files
- `semanticSearch(path, query, topK?, threshold?, filterExpr?)` - Search indexed code semantically
- `hasIndex(path)` - Check if codebase is already indexed
- `clearIndex(path, progressCallback?)` - Remove index for a codebase
- `updateIgnorePatterns(patterns)` - Update ignore patterns
- `addCustomIgnorePatterns(patterns)` - Add custom ignore patterns
- `addCustomExtensions(extensions)` - Add custom file extensions
- `updateEmbedding(embedding)` - Switch embedding provider
- `updateVectorDatabase(vectorDB)` - Switch vector database
- `updateSplitter(splitter)` - Switch code splitter
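
As a rough sketch of how the methods listed above might be combined, the helper below indexes a codebase on first use and re-indexes incrementally afterwards. `IndexerLike` and `ensureIndexed` are hypothetical names introduced only for this illustration. The interface copies the method names from the list, but the return types are assumptions rather than the package's declared signatures.

```typescript
// Hypothetical structural type: only the methods this sketch needs.
// Return types are assumptions, not the package's declared signatures.
interface IndexerLike {
    hasIndex(path: string): Promise<boolean>;
    indexCodebase(path: string): Promise<unknown>;
    reindexByChange(path: string): Promise<unknown>;
}

// Index from scratch the first time, then only re-index changed files.
async function ensureIndexed(
    indexer: IndexerLike,
    codebasePath: string
): Promise<'indexed' | 'reindexed'> {
    if (await indexer.hasIndex(codebasePath)) {
        await indexer.reindexByChange(codebasePath);
        return 'reindexed';
    }
    await indexer.indexCodebase(codebasePath);
    return 'indexed';
}
```

If the `Context` class exposes these three methods with compatible signatures, an instance could be passed directly as `indexer`. Typing against a small interface also makes the helper easy to exercise with a stub.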

### Search Results

```typescript
interface SemanticSearchResult {
    content: string;      // Code content
    relativePath: string; // File path relative to codebase root
    startLine: number;    // Starting line number
    endLine: number;      // Ending line number
    language: string;     // Programming language
    score: number;        // Similarity score (0-1)
}
```
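
A minimal sketch of post-processing results of this shape: it drops low-scoring hits and renders one summary line per result. `formatResults` and the `minScore` default of 0.5 are hypothetical choices for illustration. The interface is re-declared locally so the snippet stands alone.

```typescript
// Local copy of the result shape documented above.
interface SemanticSearchResult {
    content: string;
    relativePath: string;
    startLine: number;
    endLine: number;
    language: string;
    score: number;
}

// Keep results at or above `minScore`, best first, and render
// one "path:start-end (score)" line for each of them.
function formatResults(results: SemanticSearchResult[], minScore = 0.5): string[] {
    return results
        .filter((r) => r.score >= minScore)
        .sort((a, b) => b.score - a.score)
        .map((r) => `${r.relativePath}:${r.startLine}-${r.endLine} (${r.score.toFixed(2)})`);
}
```

Because `filter` returns a new array, the caller's result list is left untouched by the sort.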


## Examples

### Using VoyageAI Embeddings

```typescript
import { Context, MilvusVectorDatabase, VoyageAIEmbedding } from '@pleaseai/context-please-core';

// Initialize with VoyageAI embedding provider
const embedding = new VoyageAIEmbedding({
    apiKey: process.env.VOYAGEAI_API_KEY || 'your-voyageai-api-key',
    model: 'voyage-code-3'
});

const vectorDatabase = new MilvusVectorDatabase({
    address: process.env.MILVUS_ADDRESS || 'localhost:19530',
    token: process.env.MILVUS_TOKEN || ''
});

const context = new Context({
    embedding,
    vectorDatabase
});
```

### Custom File Filtering

```typescript
import { Context } from '@pleaseai/context-please-core';

// `embedding` and `vectorDatabase` are created as in the examples above
const context = new Context({
    embedding,
    vectorDatabase,
    supportedExtensions: ['.ts', '.js', '.py', '.java'],
    ignorePatterns: [
        'node_modules/**',
        'dist/**',
        '*.spec.ts',
        '*.test.js'
    ]
});
```

## File Synchronization Architecture

Context Please implements a file synchronization system that tracks and processes only the files that have changed since the last indexing operation. Skipping unchanged files keeps re-indexing fast on large codebases.

### How It Works

The file synchronization system uses a **Merkle tree-based approach** combined with SHA-256 file hashing to detect changes:

#### 1. File Hashing
- Each file in the codebase is hashed using SHA-256
- File hashes are computed based on file content, not metadata
- Hashes are stored with relative file paths for consistency across different environments
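
The hashing step could look roughly like the sketch below, which uses Node's built-in `crypto` module. `hashContent` and `hashFiles` are hypothetical names for illustration and are not taken from the package's `sync` module.

```typescript
import { createHash } from 'crypto';
import { readFileSync } from 'fs';
import * as path from 'path';

// SHA-256 of the given content, hex-encoded. Only the content is hashed,
// so timestamps and permissions do not affect the result.
function hashContent(content: Buffer | string): string {
    return createHash('sha256').update(content).digest('hex');
}

// Hash each file and key the result by its path relative to `rootDir`,
// so the same checkout produces the same keys on any machine.
function hashFiles(rootDir: string, absolutePaths: string[]): Map<string, string> {
    const hashes = new Map<string, string>();
    for (const file of absolutePaths) {
        hashes.set(path.relative(rootDir, file), hashContent(readFileSync(file)));
    }
    return hashes;
}
```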

#### 2. Merkle Tree Construction
- All file hashes are organized into a Merkle tree structure
- The tree provides a single root hash that represents the entire codebase state
- Any change to any file will cause the root hash to change
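
To make the root-hash idea concrete, here is a generic binary Merkle root over a path-to-hash map. This is a textbook construction under stated assumptions (leaves sorted by path, an unpaired node combined with itself) and is not necessarily the tree layout the package uses. `merkleRoot` is a hypothetical name.

```typescript
import { createHash } from 'crypto';

function sha256(data: string): string {
    return createHash('sha256').update(data).digest('hex');
}

// Compute a single root hash over (relativePath -> fileHash) entries.
function merkleRoot(fileHashes: Map<string, string>): string {
    // Sort by path so the root does not depend on insertion order.
    const paths = Array.from(fileHashes.keys()).sort();
    let level = paths.map((p) => sha256(`${p}:${fileHashes.get(p)}`));
    if (level.length === 0) return sha256('');
    while (level.length > 1) {
        const next: string[] = [];
        for (let i = 0; i < level.length; i += 2) {
            // An unpaired last node is combined with itself.
            const right = level[i + 1] ?? level[i];
            next.push(sha256(level[i] + right));
        }
        level = next;
    }
    return level[0];
}
```

Comparing two roots is a single string comparison, which is what makes the "nothing changed" case cheap.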

#### 3. Snapshot Management
- File synchronization state is persisted to the `~/.context/merkle/` directory
- Each codebase gets a unique snapshot file named after a hash of its absolute path
- Snapshots contain both file hashes and serialized Merkle tree data
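
One way such a snapshot location could be derived is sketched below. The `~/.context/merkle/` directory comes from the description above. The choice of SHA-256 for the path hash, the `.json` extension, and the name `snapshotPathFor` are assumptions made for this illustration.

```typescript
import { createHash } from 'crypto';
import * as os from 'os';
import * as path from 'path';

// Map a codebase directory to a per-codebase snapshot file.
// Assumption: the file name is the SHA-256 of the absolute path plus ".json".
function snapshotPathFor(codebasePath: string): string {
    const absolute = path.resolve(codebasePath);
    const id = createHash('sha256').update(absolute).digest('hex');
    return path.join(os.homedir(), '.context', 'merkle', `${id}.json`);
}
```

Resolving to an absolute path first means that different spellings of the same directory map to the same snapshot file.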

#### 4. Change Detection Process
1. **Quick Check**: Compare current Merkle root hash with stored snapshot
2. **Detailed Analysis**: If root hashes differ, perform file-by-file comparison
3. **Change Classification**: Categorize changes into three types:
   - **Added**: New files that didn't exist before
   - **Modified**: Existing files with changed content
   - **Removed**: Files that were deleted from the codebase
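
The detailed analysis and classification steps can be sketched as a comparison of two path-to-hash maps. `FileChanges` and `diffSnapshots` are hypothetical names used here for illustration. In practice this comparison would only run when the quick root-hash check reports a difference.

```typescript
interface FileChanges {
    added: string[];
    modified: string[];
    removed: string[];
}

// Compare the previous and current (relativePath -> fileHash) maps.
function diffSnapshots(
    previous: Map<string, string>,
    current: Map<string, string>
): FileChanges {
    const changes: FileChanges = { added: [], modified: [], removed: [] };
    current.forEach((hash, file) => {
        const oldHash = previous.get(file);
        if (oldHash === undefined) {
            changes.added.push(file);       // not present in the last snapshot
        } else if (oldHash !== hash) {
            changes.modified.push(file);    // present, but content changed
        }
    });
    previous.forEach((_hash, file) => {
        if (!current.has(file)) {
            changes.removed.push(file);     // present before, gone now
        }
    });
    return changes;
}
```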

#### 5. Incremental Updates
- Only process files that have actually changed
- Update vector database entries only for modified chunks
- Remove entries for deleted files
- Add entries for new files


## Contributing

This package is part of the Context Please monorepo. Please see:
- [Main Contributing Guide](../../CONTRIBUTING.md) - General contribution guidelines
- [Core Package Contributing](CONTRIBUTING.md) - Specific development guide for this package

## Related Packages

- **[@claude-context/mcp](../mcp)** - MCP server that uses this core engine
- **[VSCode Extension](../vscode-extension)** - VSCode extension built on this core


## License

MIT - See [LICENSE](../../LICENSE) for details