@ngao/search 1.0.0-alpha.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +232 -0
- package/dist-bundled/main.js +66047 -0
- package/package.json +77 -0
package/README.md
ADDED
@@ -0,0 +1,232 @@

# NGAO Search - MCP Server

A Model Context Protocol (MCP) server for local code and documentation search with LLM-friendly output.

## Features

- 🔍 **Multi-Format Indexing**: Python, Markdown, JavaScript/TypeScript, JSON, YAML
- 🧠 **LLM-Optimized**: Results formatted specifically for AI model consumption
- 📊 **Smart Ranking**: Multi-factor relevance scoring
- 🚀 **Fast Local Search**: No network dependency, instant results
- 📚 **Context-Aware**: Includes surrounding code context in results
- 🔄 **Incremental Indexing**: Only reindexes changed files
- 💾 **Persistent Storage**: LanceDB integration for data persistence across restarts
- 🤖 **Vector Search**: Semantic similarity search with embeddings support

## Installation

```bash
npm install
npm run build
```

## Usage

### As a Library

```typescript
import {
  SearchEngine,
  InMemoryBlockRepository,
  InMemoryMetadataRepository,
} from 'ngao-search-mcp';

// Create repositories
const blockRepo = new InMemoryBlockRepository();
const metadataRepo = new InMemoryMetadataRepository();

// Create search engine
const engine = new SearchEngine(blockRepo, metadataRepo);

// Index a directory
const stats = await engine.indexDirectory('./src');
console.log(`Indexed ${stats.totalBlocks} blocks from ${stats.totalFiles} files`);

// Search
const results = await engine.search('authentication handler');
console.log(`Found ${results.length} results`);
```

### As an MCP Server

```typescript
import { McpServer, handleToolCall } from 'ngao-search-mcp';

const server = new McpServer();

// Get available tools
const tools = server.getTools();

// Handle tool calls
const result = await handleToolCall(server, 'search', {
  query: 'authentication',
  maxResults: 10,
});
```

## Architecture

### Core Components

- **SearchEngine**: Main facade implementing ISearchEngine interface
- **Parsers**: File type-specific parsers (Python, Markdown, JavaScript, etc.)
- **Repositories**: Abstract data access layer
- **Indexing**: Block extraction and index building
- **Search**: Query parsing, ranking, and result formatting
- **MCP Server**: Protocol implementation for tool exposure

### Design Patterns

- **Repository Pattern**: Flexible storage backend (in-memory for tests, LanceDB for persistent storage)
- **Facade Pattern**: Single entry point via SearchEngine
- **Factory Pattern**: Parser selection by file type
- **Dependency Injection**: Explicit dependency management
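
To illustrate how the Repository, Facade, and Dependency Injection patterns listed above can fit together, here is a minimal sketch. `BlockStore`, `ArrayBlockStore`, and `SearchFacade` are hypothetical names chosen for this example, not the package's actual classes. The only point is that the facade depends on an interface, so the storage backend can be swapped without touching the search logic.

```typescript
// Hypothetical names throughout; this is not the package's real API.

// Repository pattern: the facade only sees this interface.
interface BlockStore {
  all(): string[];
}

// One possible backend: an in-memory store, handy for tests.
class ArrayBlockStore implements BlockStore {
  constructor(private readonly blocks: string[]) {}

  all(): string[] {
    return [...this.blocks];
  }
}

// Facade pattern: a single entry point that hides the storage details.
// Dependency injection: the store is passed in rather than constructed here.
class SearchFacade {
  constructor(private readonly store: BlockStore) {}

  // Case-insensitive substring match, purely for illustration.
  search(query: string): string[] {
    const needle = query.toLowerCase();
    return this.store.all().filter((block) => block.toLowerCase().includes(needle));
  }
}

const facade = new SearchFacade(
  new ArrayBlockStore(['function login() {}', 'class AuthHandler {}', 'const PORT = 3000']),
);
console.log(facade.search('auth')); // [ 'class AuthHandler {}' ]
```

A persistent backend would implement the same `BlockStore` interface and be passed to the same constructor.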

## Project Structure

```
src/
├── core/           # Models, types, constants, errors
├── parsers/        # File type parsers
├── repositories/   # Data access interfaces & implementations
├── indexing/       # Block extraction & index building
├── search/         # Query, ranking, formatting
├── api/            # SearchEngine facade
├── mcp/            # MCP server implementation
└── utils/          # Utility functions

tests/
├── unit/           # Unit tests
└── integration/    # Integration tests
```

## Coding Standards

All code follows strict TypeScript conventions:

- ✅ **Type Safety**: No `any` types, strict mode enabled
- ✅ **Naming**: PascalCase classes, camelCase methods, UPPER_SNAKE_CASE constants
- ✅ **Architecture**: Clear separation of concerns
- ✅ **Documentation**: JSDoc comments for public APIs
- ✅ **Testing**: Unit and integration tests

## Configuration

Environment variables:

```bash
NODE_ENV=development            # One of: development, production, test
NGAO_DATA_DIR=~/.ngao-search    # LanceDB storage directory (default: ~/.ngao-search)
MCP_TRANSPORT=both              # Transport mode: stdio, http, or both (default: both)
PORT=3000                       # REST API port (default: 3000)
NGAO_CACHE_TTL_SECONDS=3600
NGAO_MAX_INDEX_SIZE_MB=500
NGAO_CONTEXT_LINES=10
NGAO_MAX_RESULTS=50
```
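
As a rough illustration of how these variables could be read with fallbacks, here is a small sketch. The `Config` shape, `loadConfig`, and the helper functions are hypothetical and not part of the package. The fallbacks for `NGAO_DATA_DIR`, `MCP_TRANSPORT`, and `PORT` are the documented defaults; the other four reuse the sample values shown above, which the README does not state are defaults.

```typescript
// Hypothetical config loader; only the variable names come from the README.
type Transport = 'stdio' | 'http' | 'both';
type Env = Record<string, string | undefined>;

interface Config {
  dataDir: string;
  transport: Transport;
  port: number;
  cacheTtlSeconds: number;
  maxIndexSizeMb: number;
  contextLines: number;
  maxResults: number;
}

// Parse a non-negative integer, falling back when the variable is unset or invalid.
function intFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === '') return fallback;
  const value = Number(raw);
  return Number.isInteger(value) && value >= 0 ? value : fallback;
}

// Accept only the three documented transport modes; anything else becomes 'both'.
function transportFromEnv(env: Env): Transport {
  const raw = env['MCP_TRANSPORT'];
  return raw === 'stdio' || raw === 'http' || raw === 'both' ? raw : 'both';
}

function loadConfig(env: Env): Config {
  return {
    dataDir: env['NGAO_DATA_DIR'] ?? '~/.ngao-search',
    transport: transportFromEnv(env),
    port: intFromEnv(env, 'PORT', 3000),
    // Assumed fallbacks: sample values from the README, not confirmed defaults.
    cacheTtlSeconds: intFromEnv(env, 'NGAO_CACHE_TTL_SECONDS', 3600),
    maxIndexSizeMb: intFromEnv(env, 'NGAO_MAX_INDEX_SIZE_MB', 500),
    contextLines: intFromEnv(env, 'NGAO_CONTEXT_LINES', 10),
    maxResults: intFromEnv(env, 'NGAO_MAX_RESULTS', 50),
  };
}

const config = loadConfig({ PORT: '8080', MCP_TRANSPORT: 'stdio' });
console.log(config.port, config.transport, config.dataDir); // 8080 stdio ~/.ngao-search
```

In a Node.js process the same function could be called as `loadConfig(process.env)`.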

### Storage

NGAO Search uses **LanceDB** for persistent storage:

```
~/.ngao-search/
├── blocks.lance      # Indexed code blocks with embeddings
└── metadata.lance    # File metadata and statistics
```

Data persists across restarts. To reset the index, delete the `~/.ngao-search` directory (or the directory set in `NGAO_DATA_DIR`).

## Development

```bash
# Build
npm run build

# Type check
npm run type-check

# Lint
npm run lint
npm run lint:fix

# Format
npm run format
npm run format:check

# Test
npm test
npm run test:watch
npm run test:coverage
```

## Supported File Types

| Format | Extensions | Features |
|--------|------------|----------|
| Python | `.py`, `.pyi` | Functions, classes, methods, docstrings |
| Markdown | `.md`, `.markdown`, `.mdx` | Headings, paragraphs, code blocks |
| JavaScript | `.js`, `.jsx`, `.mjs` | Functions, classes, exports, hooks |
| TypeScript | `.ts`, `.tsx` | Types, interfaces, classes, functions |
| JSON | `.json`, `.jsonc` | Keys, nested structures |
| YAML | `.yaml`, `.yml` | Keys, values, nested structures |
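
The extension table above maps naturally onto a lookup that a parser factory could use. The sketch below is illustrative only: `Format`, `EXTENSIONS_BY_FORMAT`, and `detectFormat` are hypothetical names, and matching extensions case-insensitively is an assumption rather than documented behavior.

```typescript
// Hypothetical format detection based on the extension table in this README.
type Format = 'python' | 'markdown' | 'javascript' | 'typescript' | 'json' | 'yaml';

const EXTENSIONS_BY_FORMAT: Record<Format, readonly string[]> = {
  python: ['.py', '.pyi'],
  markdown: ['.md', '.markdown', '.mdx'],
  javascript: ['.js', '.jsx', '.mjs'],
  typescript: ['.ts', '.tsx'],
  json: ['.json', '.jsonc'],
  yaml: ['.yaml', '.yml'],
};

// Invert the table once so each lookup is a single Map access.
const FORMAT_BY_EXTENSION = new Map<string, Format>();
for (const [format, extensions] of Object.entries(EXTENSIONS_BY_FORMAT)) {
  for (const ext of extensions) {
    FORMAT_BY_EXTENSION.set(ext, format as Format);
  }
}

function detectFormat(filePath: string): Format | undefined {
  // Take the last path segment, accepting both "/" and "\" separators.
  const fileName = filePath.split(/[\\/]/).pop() ?? '';
  const dot = fileName.lastIndexOf('.');
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".env"
  // Assumption: extensions are matched case-insensitively.
  return FORMAT_BY_EXTENSION.get(fileName.slice(dot).toLowerCase());
}

console.log(detectFormat('src/api/SearchEngine.ts')); // typescript
console.log(detectFormat('docs/guide.MDX')); // markdown
console.log(detectFormat('Makefile')); // undefined
```

A factory could then pick a parser with a `switch` on the returned format and skip files for which the result is `undefined`.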

## Algorithm Details

### Ranking Formula

```
Score = 0.35 × KeywordMatch + 0.25 × Position + 0.20 × ScopeSpecificity
      + 0.10 × Recency + 0.10 × Frequency
```
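
The weights in the formula sum to 1.0, so if every factor is normalized to the range 0 to 1, the score also stays in that range. The sketch below computes the weighted sum under that assumption. The README does not say how the individual factors are calculated or normalized, and `RankingFactors`, `WEIGHTS`, and `score` are hypothetical names.

```typescript
// Factor names follow the ranking formula. Treating each factor as a value in
// [0, 1] is an assumption of this sketch.
interface RankingFactors {
  keywordMatch: number;
  position: number;
  scopeSpecificity: number;
  recency: number;
  frequency: number;
}

const WEIGHTS: RankingFactors = {
  keywordMatch: 0.35,
  position: 0.25,
  scopeSpecificity: 0.2,
  recency: 0.1,
  frequency: 0.1,
};

// Keep out-of-range inputs from skewing the weighted sum.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

function score(factors: RankingFactors): number {
  return (
    WEIGHTS.keywordMatch * clamp01(factors.keywordMatch) +
    WEIGHTS.position * clamp01(factors.position) +
    WEIGHTS.scopeSpecificity * clamp01(factors.scopeSpecificity) +
    WEIGHTS.recency * clamp01(factors.recency) +
    WEIGHTS.frequency * clamp01(factors.frequency)
  );
}

// 0.35*1 + 0.25*0.5 + 0.20*0.8 + 0.10*0.2 + 0.10*0.4 = 0.695
const example = score({
  keywordMatch: 1,
  position: 0.5,
  scopeSpecificity: 0.8,
  recency: 0.2,
  frequency: 0.4,
});
console.log(example.toFixed(3)); // 0.695
```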

### Indexing

- **Inverted Index**: Maps keywords to source blocks
- **Block Registry**: Direct block access by ID
- **Scope Hierarchy**: Quick navigation by scope path
- **Metadata**: Change detection via file hashing
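
The structures listed above can be sketched in a few lines. The example below shows an inverted index with a block registry, plus hash-based change detection. Everything here is hypothetical: the `Block` shape, the tokenizer, and the class and function names are not taken from the package. The hash is FNV-1a, used only to keep the sketch dependency-free; the README does not say which hash the package uses, and a cryptographic hash such as SHA-256 would be a more typical choice.

```typescript
// Hypothetical block shape; the package's real Block type is not shown in the README.
interface Block {
  id: string;
  filePath: string;
  content: string;
}

// Split text into lowercase word tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

class InvertedIndex {
  // Block registry: direct access by ID.
  private readonly blocks = new Map<string, Block>();
  // Inverted index: keyword -> IDs of the blocks that contain it.
  private readonly postings = new Map<string, Set<string>>();

  add(block: Block): void {
    this.blocks.set(block.id, block);
    for (const token of tokenize(block.content)) {
      let ids = this.postings.get(token);
      if (ids === undefined) {
        ids = new Set<string>();
        this.postings.set(token, ids);
      }
      ids.add(block.id);
    }
  }

  lookup(keyword: string): Block[] {
    const found: Block[] = [];
    const ids = this.postings.get(keyword.toLowerCase());
    if (ids === undefined) return found;
    ids.forEach((id) => {
      const block = this.blocks.get(id);
      if (block !== undefined) found.push(block);
    });
    return found;
  }
}

// FNV-1a over UTF-16 code units (non-cryptographic, illustration only).
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, '0');
}

// Return the paths whose current content hash differs from the stored one.
// Files with no stored hash count as changed.
function changedFiles(stored: Map<string, string>, current: Map<string, string>): string[] {
  const changed: string[] = [];
  current.forEach((content, path) => {
    if (stored.get(path) !== contentHash(content)) changed.push(path);
  });
  return changed;
}

const index = new InvertedIndex();
index.add({ id: 'b1', filePath: 'auth.ts', content: 'function authenticate(user) {}' });
index.add({ id: 'b2', filePath: 'db.ts', content: 'function connect(user, password) {}' });
console.log(index.lookup('user').map((b) => b.id)); // [ 'b1', 'b2' ]

const stored = new Map([
  ['auth.ts', contentHash('export const version = 1')],
  ['util.ts', contentHash('export {}')],
]);
const current = new Map([
  ['auth.ts', 'export const version = 2'],
  ['util.ts', 'export {}'],
  ['db.ts', 'export const db = null'],
]);
console.log(changedFiles(stored, current)); // [ 'auth.ts', 'db.ts' ]
```

Only the files returned by `changedFiles` would need to be re-parsed, which is the idea behind incremental indexing.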

## API Reference

### ISearchEngine

```typescript
interface ISearchEngine {
  search(query: string, options?: SearchOptions): Promise<SearchResult[]>;
  indexDirectory(dirPath: string, options?: IndexOptions): Promise<IndexStats>;
  getStats(): Promise<IndexStats>;
  clearIndex(): Promise<void>;
}
```

### IBlockRepository

```typescript
interface IBlockRepository {
  findById(id: string): Promise<Block | null>;
  findAll(): Promise<Block[]>;
  save(entity: Block): Promise<Block>;
  delete(id: string): Promise<boolean>;
  findByFilePath(filePath: string): Promise<Block[]>;
  searchByKeyword(keyword: string): Promise<Block[]>;
}
```
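
To show what a custom storage backend might look like, here is a Map-backed sketch that satisfies the six `IBlockRepository` methods listed above. The `Block` shape and the class name `MapBlockRepository` are made up for this example, and it is not the package's own `InMemoryBlockRepository`. Treating `searchByKeyword` as a case-insensitive substring match is also an assumption.

```typescript
// Hypothetical Block shape; the README does not show the package's real type.
interface Block {
  id: string;
  filePath: string;
  content: string;
}

// Method signatures as listed in the API reference.
interface IBlockRepository {
  findById(id: string): Promise<Block | null>;
  findAll(): Promise<Block[]>;
  save(entity: Block): Promise<Block>;
  delete(id: string): Promise<boolean>;
  findByFilePath(filePath: string): Promise<Block[]>;
  searchByKeyword(keyword: string): Promise<Block[]>;
}

class MapBlockRepository implements IBlockRepository {
  private readonly blocks = new Map<string, Block>();

  async findById(id: string): Promise<Block | null> {
    return this.blocks.get(id) ?? null;
  }

  async findAll(): Promise<Block[]> {
    return Array.from(this.blocks.values());
  }

  // Saving a block with an existing ID overwrites the earlier entry.
  async save(entity: Block): Promise<Block> {
    this.blocks.set(entity.id, entity);
    return entity;
  }

  // Resolves to true only if a block with that ID existed.
  async delete(id: string): Promise<boolean> {
    return this.blocks.delete(id);
  }

  async findByFilePath(filePath: string): Promise<Block[]> {
    return (await this.findAll()).filter((block) => block.filePath === filePath);
  }

  // Assumption: a case-insensitive substring match on block content.
  async searchByKeyword(keyword: string): Promise<Block[]> {
    const needle = keyword.toLowerCase();
    return (await this.findAll()).filter((block) =>
      block.content.toLowerCase().includes(needle),
    );
  }
}

async function demo(): Promise<void> {
  const repo = new MapBlockRepository();
  await repo.save({ id: 'b1', filePath: 'auth.ts', content: 'class AuthHandler {}' });
  await repo.save({ id: 'b2', filePath: 'db.ts', content: 'function connect() {}' });

  const hits = await repo.searchByKeyword('auth');
  console.log(hits.map((block) => block.id)); // [ 'b1' ]
  console.log(await repo.delete('b2')); // true
  console.log(await repo.findById('b2')); // null
}

void demo();
```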

## Performance

- Indexing: ~1000 blocks/second (single-threaded)
- Search: <100 ms for typical queries
- Memory: ~1 KB per indexed block (in-memory)
- Storage: ~500 bytes per block (SQLite)
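
For back-of-the-envelope planning, the approximate figures above can be turned into a simple estimate. The sketch below assumes the numbers scale linearly with block count, which the README does not claim, and reads "1 KB" as 1024 bytes. `estimateFootprint` and the constant names are hypothetical.

```typescript
// Rough per-block figures from the Performance list (all approximate).
const BLOCKS_PER_SECOND = 1000;
const MEMORY_BYTES_PER_BLOCK = 1024; // assumption: "1 KB" read as 1024 bytes
const STORAGE_BYTES_PER_BLOCK = 500;

interface FootprintEstimate {
  indexingSeconds: number;
  memoryMiB: number;
  storageMB: number;
}

// Linear extrapolation only; real numbers depend on file types and block sizes.
function estimateFootprint(blockCount: number): FootprintEstimate {
  return {
    indexingSeconds: blockCount / BLOCKS_PER_SECOND,
    memoryMiB: (blockCount * MEMORY_BYTES_PER_BLOCK) / (1024 * 1024),
    storageMB: (blockCount * STORAGE_BYTES_PER_BLOCK) / 1000000,
  };
}

const estimate = estimateFootprint(200000);
console.log(estimate.indexingSeconds); // 200
console.log(estimate.memoryMiB); // 195.3125
console.log(estimate.storageMB); // 100
```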

## License

MIT

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.

## References

For detailed architecture and design documentation, see:

- `/docs/architecture-design-standards/` - Technical design
- `/docs/reference/` - Quick reference guides
- `/docs/tracking/` - Implementation roadmap