@ngao/search 0.1.0

package/README.md ADDED
@@ -0,0 +1,219 @@
1
+ # ngao search
2
+
3
+ A Model Context Protocol (MCP) server for local code and document search with LLM-friendly output.
4
+
5
+ ## Features
6
+
7
+ - 🔍 **Multi-Format Indexing**: Python, Markdown, JavaScript/TypeScript, JSON, YAML
8
+ - 🧠 **LLM-Optimized**: Results formatted specifically for AI model consumption
9
+ - 📊 **Smart Ranking**: Multi-factor relevance scoring
10
+ - 🚀 **Fast Local Search**: No network dependency; typical queries return in under 100 ms
11
+ - 📚 **Context-Aware**: Includes surrounding code context in results
12
+ - 🔄 **Incremental Indexing**: Only reindexes changed files
13
+ - 💾 **Persistent Storage**: LanceDB integration for data persistence across restarts
14
+ - 🤖 **Vector Search**: Semantic similarity search with embeddings support
15
+
16
+ ## Installation
17
+
18
+ ```bash
19
+ npm install
20
+ npm run build
21
+ ```
22
+
23
+ ## Usage
24
+
25
+ ### As a Library
26
+
27
+ ```typescript
28
+ import { SearchEngine } from 'ngao-search-mcp';
29
+ import {
30
+ InMemoryBlockRepository,
31
+ InMemoryMetadataRepository,
32
+ } from 'ngao-search-mcp';
33
+
34
+ // Create repositories
35
+ const blockRepo = new InMemoryBlockRepository();
36
+ const metadataRepo = new InMemoryMetadataRepository();
37
+
38
+ // Create search engine
39
+ const engine = new SearchEngine(blockRepo, metadataRepo);
40
+
41
+ // Index a directory
42
+ const stats = await engine.indexDirectory('./src');
43
+ console.log(`Indexed ${stats.totalBlocks} blocks from ${stats.totalFiles} files`);
44
+
45
+ // Search
46
+ const results = await engine.search('authentication handler');
47
+ console.log(`Found ${results.length} results`);
48
+ ```
49
+
50
+ ### As MCP Server
51
+
52
+ ```typescript
53
+ import { McpServer, handleToolCall } from 'ngao-search-mcp';
54
+
55
+ const server = new McpServer();
56
+
57
+ // Get available tools
58
+ const tools = server.getTools();
59
+
60
+ // Handle tool calls
61
+ const result = await handleToolCall(server, 'search', {
62
+ query: 'authentication',
63
+ maxResults: 10,
64
+ });
65
+ ```
66
+
67
+ ## Architecture
68
+
69
+ ### Core Components
70
+
71
+ - **SearchEngine**: Main facade implementing ISearchEngine interface
72
+ - **Parsers**: File type-specific parsers (Python, Markdown, JavaScript, etc.)
73
+ - **Repositories**: Abstract data access layer
74
+ - **Indexing**: Block extraction and index building
75
+ - **Search**: Query parsing, ranking, and result formatting
76
+ - **MCP Server**: Protocol implementation for tool exposure
77
+
78
+ ### Design Patterns
79
+
80
+ - **Repository Pattern**: Flexible storage backend (in-memory for tests, LanceDB for persistence)
81
+ - **Facade Pattern**: Single entry point via SearchEngine
82
+ - **Factory Pattern**: Parser selection by file type
83
+ - **Dependency Injection**: Explicit dependency management
84
+
85
+ ## Project Structure
86
+
87
+ ```
88
+ src/
89
+ ├── core/ # Models, types, constants, errors
90
+ ├── parsers/ # File type parsers
91
+ ├── repositories/ # Data access interfaces & implementations
92
+ ├── indexing/ # Block extraction & index building
93
+ ├── search/ # Query, ranking, formatting
94
+ ├── api/ # SearchEngine facade
95
+ ├── mcp/ # MCP server implementation
96
+ └── utils/ # Utility functions
97
+
98
+ tests/
99
+ ├── unit/ # Unit tests
100
+ └── integration/ # Integration tests
101
+ ```
102
+
103
+ ## Coding Standards
104
+
105
+ All code follows strict TypeScript conventions:
106
+
107
+ - ✅ **Type Safety**: No `any` types, strict mode enabled
108
+ - ✅ **Naming**: PascalCase classes, camelCase methods, UPPER_SNAKE_CASE constants
109
+ - ✅ **Architecture**: Clear separation of concerns
110
+ - ✅ **Documentation**: JSDoc comments for public APIs
111
+ - ✅ **Testing**: Unit and integration tests
112
+
113
+ ## Configuration
114
+
115
+ Environment variables:
116
+
117
+ ```bash
118
+ PORT=3000 # REST API port (optional, defaults to random available port)
119
+ ```
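A minimal sketch of how the optional `PORT` variable could be resolved. The function name `resolvePort` is hypothetical, and returning `0` for a missing value is an assumption based on Node's convention that listening on port 0 lets the OS pick a free port; the package's actual behavior may differ.

```typescript
// Hypothetical helper: resolves the REST API port from an environment-like object.
// Assumption: 0 means "let the OS assign any available port" (Node's listen(0) convention).
function resolvePort(env: Record<string, string | undefined>): number {
  const raw = env['PORT'];
  if (raw === undefined || raw.trim() === '') {
    return 0; // PORT not set: fall back to an OS-assigned port
  }
  const port = Number(raw);
  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    throw new Error(`Invalid PORT value: ${raw}`);
  }
  return port;
}
```

In a Node process this would typically be called as `resolvePort(process.env)`.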
120
+
121
+ ### Storage
122
+
123
+ ngao search uses **LanceDB** for persistent storage:
124
+
125
+ ```bash
126
+ ~/.ngao-search/
127
+ ├── blocks.lance # Indexed code blocks with embeddings
128
+ └── metadata.lance # File metadata and statistics
129
+ ```
130
+
131
+ Data persists across restarts. To reset the index, delete the `~/.ngao-search` directory.
132
+
133
+ ## Development
134
+
135
+ ```bash
136
+ # Build
137
+ npm run build
138
+
139
+ # Type check
140
+ npm run type-check
141
+
142
+ # Lint
143
+ npm run lint
144
+ npm run lint:fix
145
+
146
+ # Format
147
+ npm run format
148
+ npm run format:check
149
+
150
+ # Test
151
+ npm test
152
+ npm run test:watch
153
+ npm run test:coverage
154
+ ```
155
+
156
+ ## Supported File Types
157
+
158
+ | Format | Extensions | Features |
159
+ |--------|-----------|----------|
160
+ | Python | `.py`, `.pyi` | Functions, classes, methods, docstrings |
161
+ | Markdown | `.md`, `.markdown`, `.mdx` | Headings, paragraphs, code blocks |
162
+ | JavaScript | `.js`, `.jsx`, `.mjs` | Functions, classes, exports, hooks |
163
+ | TypeScript | `.ts`, `.tsx` | Types, interfaces, classes, functions |
164
+ | JSON | `.json`, `.jsonc` | Keys, nested structures |
165
+ | YAML | `.yaml`, `.yml` | Keys, values, nested structures |
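The table above can be read as a lookup from file extension to format, which is the kind of decision the Factory Pattern entry describes. The sketch below only illustrates that lookup; `Format`, `EXTENSIONS`, and `detectFormat` are hypothetical names, not the package's API.

```typescript
type Format = 'python' | 'markdown' | 'javascript' | 'typescript' | 'json' | 'yaml';

// Extension lists copied from the Supported File Types table.
const EXTENSIONS: Record<Format, string[]> = {
  python: ['.py', '.pyi'],
  markdown: ['.md', '.markdown', '.mdx'],
  javascript: ['.js', '.jsx', '.mjs'],
  typescript: ['.ts', '.tsx'],
  json: ['.json', '.jsonc'],
  yaml: ['.yaml', '.yml'],
};

// Reverse lookup: extension -> format.
const FORMAT_BY_EXTENSION = new Map<string, Format>();
(Object.keys(EXTENSIONS) as Format[]).forEach((format) => {
  EXTENSIONS[format].forEach((ext) => FORMAT_BY_EXTENSION.set(ext, format));
});

// Returns the format for a path, or null when the extension is not supported.
function detectFormat(filePath: string): Format | null {
  const dot = filePath.lastIndexOf('.');
  if (dot < 0) {
    return null; // no extension at all
  }
  const ext = filePath.slice(dot).toLowerCase();
  return FORMAT_BY_EXTENSION.get(ext) ?? null;
}
```

A parser factory could then map each `Format` to a parser instance; matching is case-insensitive here, which is a design choice of this sketch rather than documented behavior.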
166
+
167
+ ## Algorithm Details
168
+
169
+ ### Ranking Formula
170
+
171
+ ```
172
+ Score = 0.35 × KeywordMatch + 0.25 × Position + 0.20 × ScopeSpecificity
173
+ + 0.10 × Recency + 0.10 × Frequency
174
+ ```
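The formula above is a weighted sum whose weights add up to 1.0. A minimal sketch of it follows, assuming each factor is already normalized to the range [0, 1]; the interface and function names are hypothetical, and the clamping step is an addition of this sketch, not something the README specifies.

```typescript
// Hypothetical factor container; each value is assumed to lie in [0, 1].
interface RankingFactors {
  keywordMatch: number;
  position: number;
  scopeSpecificity: number;
  recency: number;
  frequency: number;
}

// Weights copied from the ranking formula; they sum to 1.0.
const WEIGHTS: RankingFactors = {
  keywordMatch: 0.35,
  position: 0.25,
  scopeSpecificity: 0.2,
  recency: 0.1,
  frequency: 0.1,
};

// Assumption of this sketch: out-of-range factors are clamped rather than rejected.
function clamp01(value: number): number {
  return Math.min(1, Math.max(0, value));
}

function computeScore(factors: RankingFactors): number {
  return (
    WEIGHTS.keywordMatch * clamp01(factors.keywordMatch) +
    WEIGHTS.position * clamp01(factors.position) +
    WEIGHTS.scopeSpecificity * clamp01(factors.scopeSpecificity) +
    WEIGHTS.recency * clamp01(factors.recency) +
    WEIGHTS.frequency * clamp01(factors.frequency)
  );
}
```

Because the weights sum to 1.0, a block that scores 1 on every factor gets a total of 1, so scores stay comparable across queries.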
175
+
176
+ ### Indexing
177
+
178
+ - **Inverted Index**: Maps keywords to source blocks
179
+ - **Block Registry**: Direct block access by ID
180
+ - **Scope Hierarchy**: Quick navigation by scope path
181
+ - **Metadata**: Change detection via file hashing
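The structures listed above can be sketched together as follows. This is an illustration under stated assumptions, not the package's implementation: `SketchIndex`, `IndexedBlock`, and the paragraph-based block splitting are hypothetical, the scope hierarchy is omitted, and FNV-1a is used as a dependency-free stand-in for whatever file hash the package actually computes.

```typescript
// Hypothetical block shape; the package's real Block model may differ.
interface IndexedBlock {
  id: string;
  filePath: string;
  text: string;
}

// FNV-1a (32-bit) over UTF-16 code units: a stand-in hash for change detection.
function hashContent(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
}

class SketchIndex {
  private readonly invertedIndex = new Map<string, Set<string>>(); // keyword -> block IDs
  private readonly blockRegistry = new Map<string, IndexedBlock>(); // block ID -> block
  private readonly fileHashes = new Map<string, string>(); // file path -> content hash

  // Returns false when the content hash is unchanged and the file was skipped.
  indexFile(filePath: string, content: string): boolean {
    const hash = hashContent(content);
    if (this.fileHashes.get(filePath) === hash) {
      return false;
    }
    this.removeFile(filePath);
    // Simplification: one block per non-empty paragraph instead of parsed functions or headings.
    const parts = content.split(/\n\s*\n/).filter((part) => part.trim().length > 0);
    parts.forEach((text, i) => {
      const block: IndexedBlock = { id: `${filePath}#${i}`, filePath, text };
      this.blockRegistry.set(block.id, block);
      tokenize(text).forEach((keyword) => {
        let ids = this.invertedIndex.get(keyword);
        if (!ids) {
          ids = new Set<string>();
          this.invertedIndex.set(keyword, ids);
        }
        ids.add(block.id);
      });
    });
    this.fileHashes.set(filePath, hash);
    return true;
  }

  // Resolves a keyword to blocks via the inverted index, then the block registry.
  lookup(keyword: string): IndexedBlock[] {
    const blocks: IndexedBlock[] = [];
    const ids = this.invertedIndex.get(keyword.toLowerCase());
    if (ids) {
      ids.forEach((id) => {
        const block = this.blockRegistry.get(id);
        if (block) blocks.push(block);
      });
    }
    return blocks;
  }

  // Drops every block of a file before it is reindexed.
  private removeFile(filePath: string): void {
    this.blockRegistry.forEach((block, id) => {
      if (block.filePath !== filePath) return;
      this.blockRegistry.delete(id);
      tokenize(block.text).forEach((keyword) => {
        this.invertedIndex.get(keyword)?.delete(id);
      });
    });
  }
}
```

Calling `indexFile` twice with identical content does no work the second time, which is the idea behind incremental indexing: only files whose hash changed are re-parsed.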
182
+
183
+ ## API Reference
184
+
185
+ ### ISearchEngine
186
+
187
+ ```typescript
188
+ search(query: string, options?: SearchOptions): Promise<SearchResult[]>
189
+ indexDirectory(dirPath: string, options?: IndexOptions): Promise<IndexStats>
190
+ getStats(): Promise<IndexStats>
191
+ clearIndex(): Promise<void>
192
+ ```
193
+
194
+ ### IBlockRepository
195
+
196
+ ```typescript
197
+ findById(id: string): Promise<Block | null>
198
+ findAll(): Promise<Block[]>
199
+ save(entity: Block): Promise<Block>
200
+ delete(id: string): Promise<boolean>
201
+ findByFilePath(filePath: string): Promise<Block[]>
202
+ searchByKeyword(keyword: string): Promise<Block[]>
203
+ ```
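To make the contract above concrete, here is a minimal in-memory sketch that satisfies the listed method signatures. The `Block` fields and the class name `MapBlockRepository` are assumptions for illustration; the package's own `InMemoryBlockRepository` may be implemented differently, and `searchByKeyword` here is a plain case-insensitive substring match rather than an inverted-index lookup.

```typescript
// Hypothetical minimal Block shape; the real model likely carries more fields.
interface Block {
  id: string;
  filePath: string;
  content: string;
}

// Method signatures copied from the IBlockRepository listing.
interface IBlockRepository {
  findById(id: string): Promise<Block | null>;
  findAll(): Promise<Block[]>;
  save(entity: Block): Promise<Block>;
  delete(id: string): Promise<boolean>;
  findByFilePath(filePath: string): Promise<Block[]>;
  searchByKeyword(keyword: string): Promise<Block[]>;
}

class MapBlockRepository implements IBlockRepository {
  private readonly blocks = new Map<string, Block>();

  async findById(id: string): Promise<Block | null> {
    return this.blocks.get(id) ?? null;
  }

  async findAll(): Promise<Block[]> {
    return Array.from(this.blocks.values());
  }

  // Stores a copy so later mutation of the caller's object does not affect the store.
  async save(entity: Block): Promise<Block> {
    this.blocks.set(entity.id, { ...entity });
    return entity;
  }

  // Resolves to true only if a block with this ID existed.
  async delete(id: string): Promise<boolean> {
    return this.blocks.delete(id);
  }

  async findByFilePath(filePath: string): Promise<Block[]> {
    return (await this.findAll()).filter((block) => block.filePath === filePath);
  }

  // Simplification: case-insensitive substring match over block content.
  async searchByKeyword(keyword: string): Promise<Block[]> {
    const needle = keyword.toLowerCase();
    return (await this.findAll()).filter(
      (block) => block.content.toLowerCase().indexOf(needle) !== -1,
    );
  }
}
```

Keeping every method asynchronous, even for an in-memory store, lets the same interface front a persistent backend without changing callers.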
204
+
205
+ ## Performance
206
+
207
+ - Indexing: ~1,000 blocks/second (single-threaded)
208
+ - Search: <100 ms for typical queries
209
+ - Memory: ~1 KB per indexed block (in-memory)
210
+ - Storage: ~500 bytes per block (on disk)
211
+
212
+ ## License
213
+
214
+ MIT
215
+
216
+ ## Contributing
217
+
218
+ Contributions are welcome! Please feel free to submit issues and pull requests.
219
+
@@ -0,0 +1,9 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Wrapper entry point for ngao search CLI
4
+ * This allows npx @ngaodev/search --help to work without the -- separator
5
+ * Arguments are not forwarded explicitly; the required module shares this process's process.argv
6
+ */
7
+
8
+ // Load the bundled main.js, which sees the same process.argv as this wrapper
9
+ require('../dist/main.js');