blink-query 1.0.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,51 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ ## [1.0.0] - 2026-02-14
6
+
7
+ ### Core
8
+ - DNS-inspired path resolution with 5 record types (SUMMARY, META, COLLECTION, SOURCE, ALIAS)
9
+ - Auto-COLLECTION generation when resolving namespace paths
10
+ - ALIAS chain resolution with loop detection (max 5 hops)
11
+ - TTL-based STALE detection on resolve
12
+ - SQLite storage via better-sqlite3, single-file embedded database
13
+
14
+ ### Query DSL
15
+ - PEG-based query language: `namespace where field = 'value' order by field limit N offset M`
16
+ - Operators: `=`, `!=`, `>`, `<`, `>=`, `<=`, `contains`, `in`
17
+ - Boolean logic: `AND`, `OR` with parenthesized grouping
18
+ - `since` shorthand for date filtering
19
+
20
+ ### Search
21
+ - FTS5 full-text search with porter stemming and unicode61 tokenization
22
+ - bm25 relevance ranking
23
+ - Namespace-scoped search with configurable limits
24
+
25
+ ### Ingestion
26
+ - Directory ingestion with basic fs walker (LlamaIndex optional)
27
+ - PostgreSQL ingestion: classic SQL, progressive batched, and table shorthand
28
+ - Web URL ingestion with HTML stripping
29
+ - Git repository ingestion via git CLI
30
+ - LLM-powered summarization and classification (OpenAI-compatible API)
31
+ - Preset derivers: FILESYSTEM, POSTGRES, WEB, GIT
32
+ - Extractive summarizer (no API key needed)
33
+
34
+ ### Validation
35
+ - Input validation layer (`src/validation.ts`) at save() boundary
36
+ - Namespace, title, TTL, content size (10MB), tags, RecordType validation
37
+ - PostgreSQL WHERE clause injection prevention
38
+ - Path traversal prevention
39
+
40
+ ### MCP Server
41
+ - 9 tools: resolve, save, search, list, query, get, delete, move, zones
42
+ - Runtime RecordType validation
43
+ - Input length limits on all string parameters
44
+ - Stdio transport for AI agent connectivity
45
+
46
+ ### CLI
47
+ - 9 commands: resolve, save, search, list, query, ingest, serve, zones, delete
48
+ - Commander-based with help text
49
+
50
+ ### Testing
51
+ - 269 unit tests, 37 integration tests (vitest)
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Arpit
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,359 @@
1
+ # Blink
2
+
3
+ **DNS-inspired knowledge resolution layer for AI agents**
4
+
5
+ [![Tests](https://img.shields.io/badge/tests-320%20passing-success)]() [![Node](https://img.shields.io/badge/node-%3E%3D20-brightgreen)]() [![License](https://img.shields.io/badge/license-MIT-blue)]() [![npm](https://img.shields.io/npm/v/blink-query)]()
6
+
7
+ Blink sits between your data and your AI agent. It turns documents from anywhere — files, databases, web pages, git repos — into typed knowledge records with DNS-like resolution semantics.
8
+
9
+ ```
10
+ Your Data → [Load → Store → Find] → Your Agent
11
+ ```
12
+
13
+ ---
14
+
15
+ ## Quick Start
16
+
17
+ ### Installation
18
+
19
+ ```bash
20
+ npm install blink-query
21
+
22
+ # Optional: for PDF/DOCX support
23
+ npm install llamaindex @llamaindex/readers
24
+
25
+ # Optional: for PostgreSQL ingestion
26
+ npm install pg
27
+ ```
28
+
29
+ ### Library API
30
+
31
+ ```typescript
32
+ import { Blink, extractiveSummarize } from 'blink-query';
33
+
34
+ const blink = new Blink();
35
+
36
+ // Ingest a directory
37
+ await blink.ingestDirectory('./docs', {
38
+ summarize: extractiveSummarize(500),
39
+ namespacePrefix: 'knowledge'
40
+ });
41
+
42
+ // Resolve knowledge
43
+ const response = blink.resolve('knowledge/readme');
44
+ console.log(response.record.summary);
45
+
46
+ // Query with DSL
47
+ const results = blink.query('knowledge where type = "SUMMARY" limit 5');
48
+
49
+ // Search
50
+ const found = blink.search('authentication jwt');
51
+
52
+ blink.close();
53
+ ```
54
+
55
+ ### CLI
56
+
57
+ ```bash
58
+ # Ingest files
59
+ blink ingest ./my-docs --prefix knowledge --tags "v1,docs"
60
+
61
+ # Resolve a path
62
+ blink resolve knowledge/readme
63
+
64
+ # Search
65
+ blink search "authentication api"
66
+
67
+ # Query with DSL
68
+ blink query 'knowledge where tags contains "urgent" order by hit_count desc'
69
+
70
+ # List records in a namespace
71
+ blink list knowledge --limit 20 --offset 0
72
+
73
+ # Manage zones
74
+ blink zones
75
+
76
+ # Move and delete
77
+ blink move knowledge/old-path knowledge/new-path
78
+ blink delete knowledge/outdated-doc
79
+ ```
80
+
81
+ All CLI commands support `--json` for machine-readable output and `--db` to target a specific database file.
82
+
83
+ ### MCP Server (for AI agents)
84
+
85
+ ```bash
86
+ blink serve
87
+ # AI agent connects via stdio MCP protocol
88
+ ```
89
+
90
+ 9 tools available: `blink_resolve`, `blink_save`, `blink_search`, `blink_list`, `blink_query`, `blink_get`, `blink_delete`, `blink_move`, `blink_zones`.
91
+
92
+ ### In-Memory Mode (for testing)
93
+
94
+ ```typescript
95
+ const blink = new Blink({ dbPath: ':memory:' });
96
+ ```
97
+
98
+ ---
99
+
100
+ ## The 5 Record Types
101
+
102
+ | Type | What it tells the agent | Example |
103
+ |------|------------------------|---------|
104
+ | **SUMMARY** | Read this directly, you have what you need | Project overview, meeting notes |
105
+ | **META** | Structured data, parse it | `{ status: "active", contributors: 12 }` |
106
+ | **COLLECTION** | Browse children, pick what's relevant | Table of contents, directory listings |
107
+ | **SOURCE** | Summary here, fetch source if you need depth | Large files, external APIs |
108
+ | **ALIAS** | Follow the redirect to the real record | Shortcuts, renames |
109
+
110
+ **Core innovation**: the type carries the consumption instruction; the content carries the domain semantics.
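One way an agent might act on these types is a small dispatch table. The sketch below is illustrative only: `HINTS`, `consumptionHint`, and the hint wording are assumptions derived from the table above, not part of the blink-query API.

```typescript
// Hypothetical helper: maps each of the five record types to the
// consumption instruction described in the record type table.
type RecordType = 'SUMMARY' | 'META' | 'COLLECTION' | 'SOURCE' | 'ALIAS';

const HINTS: Record<RecordType, string> = {
  SUMMARY: 'read directly',
  META: 'parse as structured data',
  COLLECTION: 'browse children',
  SOURCE: 'read summary, fetch source for depth',
  ALIAS: 'follow redirect',
};

// Returns the hint for a record type; the Record<RecordType, string>
// annotation makes the compiler reject a missing or misspelled type.
function consumptionHint(type: RecordType): string {
  return HINTS[type];
}

console.log(consumptionHint('SOURCE'));
```

Keeping the mapping in one typed table means a sixth record type, if one were ever added, would fail to compile until a hint is supplied.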
111
+
112
+ ---
113
+
114
+ ## Data Sources
115
+
116
+ ### Directory Ingestion
117
+
118
+ ```typescript
119
+ await blink.ingestDirectory('./docs', {
120
+ summarize: extractiveSummarize(500),
121
+ namespacePrefix: 'docs',
122
+ maxFileSize: 1048576, // 1MB limit (default)
123
+ includeHidden: false, // skip dotfiles (default)
124
+ onProgress: ({ current, total, file }) => {
125
+ console.log(`${current}/${total}: ${file}`);
126
+ }
127
+ });
128
+ ```
129
+
130
+ Supports 50+ file extensions out of the box. By default, it skips empty files, hidden files, and files over the size limit.
131
+
132
+ ### PostgreSQL Ingestion
133
+
134
+ ```typescript
135
+ await blink.ingestFromPostgres({
136
+ connectionString: 'postgresql://localhost/mydb',
137
+ query: 'SELECT id, title, body FROM articles',
138
+ textColumn: 'body',
139
+ idColumn: 'id',
140
+ titleColumn: 'title'
141
+ });
142
+ ```
143
+
144
+ ### Web Ingestion
145
+
146
+ ```typescript
147
+ import { loadFromWeb } from 'blink-query';
148
+
149
+ const docs = await loadFromWeb([
150
+ 'https://example.com/docs/getting-started',
151
+ 'https://example.com/docs/api-reference'
152
+ ]);
153
+ await blink.ingest(docs, { namespacePrefix: 'web' });
154
+ ```
155
+
156
+ ### Git Ingestion
157
+
158
+ ```typescript
159
+ import { loadFromGit } from 'blink-query';
160
+
161
+ const docs = await loadFromGit({
162
+ repoPath: '/path/to/repo',
163
+ ref: 'main',
164
+ extensions: ['.md', '.ts']
165
+ });
166
+ await blink.ingest(docs, { namespacePrefix: 'repo' });
167
+ ```
168
+
169
+ ### LLM-Powered Summarization
170
+
171
+ ```typescript
172
+ import { Blink, configureLLM } from 'blink-query';
173
+
174
+ // Configure via environment variables:
175
+ // BLINK_LLM_PROVIDER=openai
176
+ // BLINK_LLM_MODEL=gpt-4o-mini
177
+ // OPENAI_API_KEY=...
178
+
179
+ const blink = new Blink();
+ const summarize = configureLLM();
180
+
181
+ await blink.ingestDirectory('./docs', {
182
+ summarize,
183
+ namespacePrefix: 'knowledge'
184
+ });
185
+ ```
186
+
187
+ Or bring your own summarizer:
188
+
189
+ ```typescript
190
+ await blink.ingest(docs, {
191
+ summarize: async (text, metadata) => {
192
+ // Call any LLM, return a string
193
+ return await myLLM.summarize(text);
194
+ }
195
+ });
196
+ ```
197
+
198
+ ---
199
+
200
+ ## Query DSL
201
+
202
+ SQL-like query language for filtering records:
203
+
204
+ ```
205
+ namespace where field op value [and|or condition] [order by field asc|desc] [limit N] [offset N] [since "date"]
206
+ ```
207
+
208
+ ### Examples
209
+
210
+ ```bash
211
+ # Filter by type
212
+ blink query 'docs where type = "SUMMARY"'
213
+
214
+ # Tag search
215
+ blink query 'projects where tags contains "urgent" order by hit_count desc'
216
+
217
+ # Boolean logic
218
+ blink query 'docs where type = "SOURCE" and hit_count > 10'
219
+
220
+ # NOT operator
221
+ blink query 'docs where not type = "ALIAS"'
222
+
223
+ # IN operator
224
+ blink query 'docs where type in ("SUMMARY", "META")'
225
+
226
+ # Pagination
227
+ blink query 'docs where type = "SUMMARY" limit 10 offset 20'
228
+
229
+ # Date filtering
230
+ blink query 'docs since "2026-01-01"'
231
+ ```
232
+
233
+ ---
234
+
235
+ ## Resolution
236
+
237
+ ```typescript
238
+ const response = blink.resolve('projects/orpheus/readme');
239
+
240
+ switch (response.status) {
241
+ case 'OK': // Record found
242
+ case 'STALE': // Record found but TTL expired
243
+ case 'NXDOMAIN': // Not found
244
+ case 'ALIAS_LOOP': // Circular alias detected
245
+ }
246
+ ```
247
+
248
+ Resolution follows DNS semantics:
249
+ - Direct path lookup
250
+ - ALIAS chains (up to 5 hops)
251
+ - Auto-COLLECTION: resolving a namespace generates a listing of child records
252
+ - TTL expiry: records past their TTL return with STALE status
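The resolution rules above can be sketched against an in-memory map. This is a minimal illustration, not the library's resolver: `SketchRecord`, `resolveSketch`, the `ttlSeconds` and `updatedAt` fields, and the choice to report `ALIAS_LOOP` when the hop limit is exceeded are all assumptions, and auto-COLLECTION is left out.

```typescript
type RecordType = 'SUMMARY' | 'META' | 'COLLECTION' | 'SOURCE' | 'ALIAS';
type Status = 'OK' | 'STALE' | 'NXDOMAIN' | 'ALIAS_LOOP';

// Hypothetical record shape, for illustration only.
interface SketchRecord {
  path: string;
  type: RecordType;
  target?: string;     // set on ALIAS records
  ttlSeconds?: number; // undefined means "never expires" (assumption)
  updatedAt: number;   // epoch milliseconds
}

const MAX_HOPS = 5;

function resolveSketch(
  store: Map<string, SketchRecord>,
  path: string,
  now: number = Date.now(),
): { status: Status; record: SketchRecord | null } {
  const seen = new Set<string>();
  let current = path;
  let hops = 0;

  while (true) {
    // Revisiting a path means the alias chain is circular.
    if (seen.has(current)) return { status: 'ALIAS_LOOP', record: null };
    seen.add(current);

    const record = store.get(current);
    if (!record) return { status: 'NXDOMAIN', record: null };

    // Terminal record (or an ALIAS with no target): apply the TTL check.
    if (record.type !== 'ALIAS' || record.target === undefined) {
      const expired =
        record.ttlSeconds !== undefined &&
        now - record.updatedAt > record.ttlSeconds * 1000;
      return { status: expired ? 'STALE' : 'OK', record };
    }

    // Follow the alias, giving up after MAX_HOPS redirects (assumed behavior).
    hops += 1;
    if (hops > MAX_HOPS) return { status: 'ALIAS_LOOP', record: null };
    current = record.target;
  }
}

const demo = new Map<string, SketchRecord>([
  ['docs/old', { path: 'docs/old', type: 'ALIAS', target: 'docs/new', updatedAt: 0 }],
  ['docs/new', { path: 'docs/new', type: 'SUMMARY', updatedAt: 0 }],
]);
console.log(resolveSketch(demo, 'docs/old', 0).status);
```

Tracking visited paths catches short cycles immediately, while the hop counter bounds long non-circular chains.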
253
+
254
+ ---
255
+
256
+ ## API Design
257
+
258
+ All record operations are **synchronous**, so no `await` is needed:
259
+
260
+ | Method | Returns | Description |
261
+ |--------|---------|-------------|
262
+ | `resolve(path)` | `{ status, record }` | DNS-like path resolution |
263
+ | `get(path)` | `record \| null` | Direct lookup |
264
+ | `save(input)` | `record` | Create or update |
265
+ | `delete(path)` | `boolean` | Remove a record |
266
+ | `move(from, to)` | `record \| null` | Move/rename |
267
+ | `search(query)` | `record[]` | FTS5 keyword search |
268
+ | `list(namespace)` | `record[]` | List records in namespace |
269
+ | `query(dsl)` | `record[]` | Query DSL filtering |
270
+
271
+ Only ingestion methods (`ingest`, `ingestDirectory`, `ingestFromPostgres`) are async.
272
+
273
+ ### Error Handling
274
+
275
+ - `resolve()` returns a status object — check `status` before using `record`
276
+ - `get()` returns `null` if the path doesn't exist
277
+ - `delete()` returns `false` if the record wasn't found
278
+ - `move()` returns `null` if the source doesn't exist
279
+ - `query()` throws on invalid query syntax
280
+ - `save()` throws on invalid input (e.g., ALIAS without target)
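A caller might fold these conventions into one helper that always yields something usable. The sketch below declares local stand-in types based on the `{ status, record }` shape in the table above; the real exported types may differ, and `summaryOrFallback` is a hypothetical name.

```typescript
// Local stand-ins for the shapes described above (assumptions).
type ResolveStatus = 'OK' | 'STALE' | 'NXDOMAIN' | 'ALIAS_LOOP';

interface ResolveResponse {
  status: ResolveStatus;
  record: { summary: string } | null;
}

// Checks `status` before touching `record`, as the first bullet advises.
function summaryOrFallback(response: ResolveResponse, fallback: string): string {
  const status = response.status;
  switch (status) {
    case 'OK':
      return response.record?.summary ?? fallback;
    case 'STALE':
      // Still usable, but flagged so the caller can decide to refresh.
      return response.record ? `${response.record.summary} (stale)` : fallback;
    case 'NXDOMAIN':
    case 'ALIAS_LOOP':
      return fallback;
    default: {
      // Compile-time exhaustiveness check: fails if a status is unhandled.
      const unreachable: never = status;
      return unreachable;
    }
  }
}

console.log(summaryOrFallback({ status: 'NXDOMAIN', record: null }, 'no record'));
```

Since `query()` and `save()` throw rather than return a status, calls to them would be wrapped in `try`/`catch` instead.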
281
+
282
+ ---
283
+
284
+ ## Input Validation
285
+
286
+ All input is validated at the save boundary:
287
+
288
+ - Namespaces: no path traversal (`..`), no special characters (`#`, `?`, `%`)
289
+ - Titles: non-empty, trimmed
290
+ - Content: max 10MB
291
+ - Tags: deduplicated, cleaned
292
+ - Record types: must be one of the 5 valid types
293
+ - PostgreSQL WHERE clauses: checked for injection patterns
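To make the namespace and tag rules concrete, here is a minimal sketch of checks in that spirit. It is not the code in `src/validation.ts`: the function names, the empty-namespace check, and the reading of "cleaned" as "trimmed, with empty tags dropped" are assumptions.

```typescript
// Illustrative only: mirrors the namespace and tag rules listed above.
const FORBIDDEN_CHARS = ['#', '?', '%'];

// Returns a list of problems; an empty list means the namespace passed.
function validateNamespaceSketch(namespace: string): string[] {
  const problems: string[] = [];
  const trimmed = namespace.trim();

  if (trimmed.length === 0) {
    problems.push('namespace is empty'); // assumption, not stated in the list
  }
  // Reject any ".." path segment (path traversal).
  if (trimmed.split('/').includes('..')) {
    problems.push('path traversal segment ".."');
  }
  // Reject the special characters named above.
  const found = FORBIDDEN_CHARS.filter((ch) => trimmed.includes(ch));
  if (found.length > 0) {
    problems.push(`forbidden characters: ${found.join(' ')}`);
  }
  return problems;
}

// Trims tags, drops empty ones, and removes duplicates, keeping first-seen order.
function cleanTagsSketch(tags: string[]): string[] {
  const cleaned = tags.map((t) => t.trim()).filter((t) => t.length > 0);
  return cleaned.filter((t, i) => cleaned.indexOf(t) === i);
}

console.log(validateNamespaceSketch('docs/../secrets'));
console.log(cleanTagsSketch([' v1', 'v1', '', 'docs']));
```

Collecting every problem instead of throwing on the first one lets a caller report all issues at once; the README states only that `save()` throws on invalid input.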
294
+
295
+ ---
296
+
297
+ ## Architecture
298
+
299
+ ```
300
+ ┌─────────────────────────────────────────────────────────────┐
301
+ │ Blink System │
302
+ ├─────────────┬───────────────┬──────────────┬───────────────┤
303
+ │ Ingestion │ Storage │ Resolution │ Consumption │
304
+ │ │ │ │ │
305
+ │ Directory │ SQLite DB │ Resolver │ Library │
306
+ │ PostgreSQL │ FTS5 Search │ Query DSL │ CLI │
307
+ │ Web / Git │ Transactions │ Auto-COLL │ MCP Server │
308
+ │ LLM Summary│ Zones │ ALIAS chain │ JSON output │
309
+ └─────────────┴───────────────┴──────────────┴───────────────┘
310
+ ```
311
+
312
+ See [docs/ARCHITECTURE.md](./docs/ARCHITECTURE.md) for a full plain-language guide.
313
+
314
+ ---
315
+
316
+ ## Development
317
+
318
+ ```bash
319
+ # Install dependencies
320
+ npm install
321
+
322
+ # Build (parser + library + CLI)
323
+ npm run build
324
+
325
+ # Run tests (excludes integration tests)
326
+ npm test
327
+
328
+ # Run all tests (including PostgreSQL integration)
329
+ npm run test:all
330
+
331
+ # Build PEG parser only
332
+ npm run build:parser
333
+
334
+ # Pack for inspection before publishing
335
+ npm pack --dry-run
336
+
337
+ # CLI (dev mode)
338
+ node dist/index.js --help
339
+ ```
340
+
341
+ ---
342
+
343
+ ## Use Cases
344
+
345
+ - **Agent memory** — Store conversation context with semantic types
346
+ - **Project knowledge base** — Ingest codebases, docs, wikis
347
+ - **API caching** — Cache API responses with TTL
348
+ - **Research notes** — Structure knowledge with namespaces
349
+ - **Configuration** — Store settings as META records
350
+
351
+ ---
352
+
353
+ ## License
354
+
355
+ MIT — see [LICENSE](./LICENSE)
356
+
357
+ ---
358
+
359
+ **Questions?** [Open an issue](https://github.com/arpitnath/blink-query/issues) or read the [architecture docs](./docs/ARCHITECTURE.md).