@tryformation/querylight-cli 0.1.1 → 0.2.0
- package/README.md +62 -9
- package/dist/chunk/chunker.d.ts +3 -1
- package/dist/cli/main.js +1031 -237
- package/dist/cli/run-cli.d.ts +4 -1
- package/dist/core/concurrency.d.ts +1 -0
- package/dist/core/constants.d.ts +3 -1
- package/dist/core/progress.d.ts +4 -0
- package/dist/core/urls.d.ts +1 -0
- package/dist/index/querylight-indexer.d.ts +3 -1
- package/dist/index.js +441 -114
- package/dist/ingest/adapters/website-adapter.d.ts +6 -1
- package/dist/ingest/adapters/website-feed-discovery.d.ts +6 -0
- package/dist/ingest/extractors/html-extractor.d.ts +1 -0
- package/dist/ingest/ingest-service.d.ts +5 -2
- package/dist/types/models.d.ts +2 -2
- package/dist/vector/dense.d.ts +3 -1
- package/dist/vector/runtime.d.ts +2 -0
- package/dist/vector/service.d.ts +20 -2
- package/dist/vector/sparse.d.ts +3 -1
- package/dist/vector/store.d.ts +8 -2
- package/package.json +1 -1
package/README.md
CHANGED
|
@@ -26,6 +26,8 @@ Run without installing globally:
|
|
|
26
26
|
bunx @tryformation/querylight-cli init
|
|
27
27
|
```
|
|
28
28
|
|
|
29
|
+
For agent and Python automation examples that use `bunx` and `uv`, see [`examples/skills/qli-bunx-uv/SKILL.md`](https://github.com/formation-res/querylight-cli/blob/main/examples/skills/qli-bunx-uv/SKILL.md).
|
|
30
|
+
|
|
29
31
|
Install as a dependency:
|
|
30
32
|
|
|
31
33
|
```bash
|
|
@@ -38,6 +40,12 @@ Then run:
|
|
|
38
40
|
npx qli --help
|
|
39
41
|
```
|
|
40
42
|
|
|
43
|
+
If you prefer to avoid a local install, use:
|
|
44
|
+
|
|
45
|
+
```bash
|
|
46
|
+
bunx @tryformation/querylight-cli --help
|
|
47
|
+
```
|
|
48
|
+
|
|
41
49
|
## Release
|
|
42
50
|
|
|
43
51
|
Publish releases from semantic version tags such as `0.1.1`.
|
|
@@ -78,6 +86,8 @@ Initialize a workspace:
|
|
|
78
86
|
qli init
|
|
79
87
|
```
|
|
80
88
|
|
|
89
|
+
`qli init` creates the workspace config, enables dense and sparse retrieval for new workspaces, and pulls missing model assets when the runtime is available.
|
|
90
|
+
|
|
81
91
|
Add a local docs directory:
|
|
82
92
|
|
|
83
93
|
```bash
|
|
@@ -87,7 +97,7 @@ qli source add directory ./docs --name "Local Docs" --tag docs
|
|
|
87
97
|
Build the knowledge base:
|
|
88
98
|
|
|
89
99
|
```bash
|
|
90
|
-
qli
|
|
100
|
+
qli ingest
|
|
91
101
|
```
|
|
92
102
|
|
|
93
103
|
Search it:
|
|
@@ -109,6 +119,18 @@ Generate retrieval context:
|
|
|
109
119
|
qli context "How do I authenticate API requests?" --top-k 8
|
|
110
120
|
```
|
|
111
121
|
|
|
122
|
+
## Example Skill: `qli` with `bunx` and `uv`
|
|
123
|
+
|
|
124
|
+
The repository includes an example skill for running `qli` without a global install and calling it from Python with `uv`:
|
|
125
|
+
|
|
126
|
+
- [`examples/skills/qli-bunx-uv/SKILL.md`](https://github.com/formation-res/querylight-cli/blob/main/examples/skills/qli-bunx-uv/SKILL.md)
|
|
127
|
+
|
|
128
|
+
It covers:
|
|
129
|
+
|
|
130
|
+
- running `qli` with `bunx @tryformation/querylight-cli`
|
|
131
|
+
- using `--json` for automation and agents
|
|
132
|
+
- calling `qli search` and `qli context` from Python with `subprocess`
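The linked skill targets Python, but the same automation pattern can be sketched in TypeScript. The helper below is hypothetical (`buildQliArgv`, `QliCommand`, and `QliCallOptions` are illustration-only names, not part of the package). It only assembles an argument list from the package name, subcommands, and flags that appear in this README, and always appends `--json` so a calling tool gets structured output.

```typescript
// Hypothetical helper: assembles an argv array for running qli through bunx.
// Only the package name, subcommands, and flags shown in the README are used.
type QliCommand = "search" | "context";

interface QliCallOptions {
  topK?: number; // maps to --top-k
  workspace?: string; // maps to --workspace
}

function buildQliArgv(
  command: QliCommand,
  query: string,
  options: QliCallOptions = {},
): string[] {
  const argv = ["bunx", "@tryformation/querylight-cli"];
  // The README places --workspace before the subcommand.
  if (options.workspace !== undefined) {
    argv.push("--workspace", options.workspace);
  }
  argv.push(command, query);
  if (options.topK !== undefined) {
    argv.push("--top-k", String(options.topK));
  }
  // Structured output for agents and other tools.
  argv.push("--json");
  return argv;
}

const exampleArgv = buildQliArgv("context", "How do I authenticate API requests?", { topK: 8 });
console.log(exampleArgv);
```

Passing the array to a process spawner as separate arguments, rather than joining it into a shell string, avoids quoting problems when the query contains spaces or quotes.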
|
|
133
|
+
|
|
112
134
|
## Example: Index `querylight.tryformation.com`
|
|
113
135
|
|
|
114
136
|
This example uses a local linked build of `qli` to create a test knowledge base for the Querylight documentation website.
|
|
@@ -146,10 +168,13 @@ qli source add website https://querylight.tryformation.com \
|
|
|
146
168
|
--tag docs
|
|
147
169
|
```
|
|
148
170
|
|
|
149
|
-
|
|
171
|
+
`qli source add website` may also detect one blog or news feed and register it as a separate `rss` source. Use `--json` when another tool needs the full list of created sources.
|
|
172
|
+
Use `qli source add page` for one page. Use `qli source add website` when you want crawling or feed detection.
|
|
173
|
+
|
|
174
|
+
5. Ingest content and refresh the local index:
|
|
150
175
|
|
|
151
176
|
```bash
|
|
152
|
-
qli
|
|
177
|
+
qli ingest
|
|
153
178
|
```
|
|
154
179
|
|
|
155
180
|
6. Inspect and query the result:
|
|
@@ -189,19 +214,30 @@ The default workspace is `.kb/`.
|
|
|
189
214
|
logs/
|
|
190
215
|
```
|
|
191
216
|
|
|
217
|
+
Vector model downloads are shared across workspaces under `~/.qli/models/` by default. `qli init` pulls missing model assets for enabled retrieval modes, so a new workspace is ready for vector indexing after setup.
|
|
218
|
+
|
|
192
219
|
Use a custom workspace with:
|
|
193
220
|
|
|
194
221
|
```bash
|
|
195
222
|
qli --workspace ./my-kb <command>
|
|
196
223
|
```
|
|
197
224
|
|
|
225
|
+
Control the default remote concurrency in `config.yaml`:
|
|
226
|
+
|
|
227
|
+
```yaml
|
|
228
|
+
crawler:
|
|
229
|
+
maxConcurrentRequests: 5
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
Set `crawl.maxConcurrentRequests` on a website or RSS source when one source needs a different limit.
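One way to read the two settings above is as a simple fallback chain: the per-source value, then the workspace value, then the documented default of 5. The sketch below assumes that precedence. The interfaces and the function name `resolveMaxConcurrentRequests` are hypothetical; only the key names `crawler.maxConcurrentRequests` and `crawl.maxConcurrentRequests` come from the README.

```typescript
// Hypothetical config shapes; only the key names are taken from the README.
interface WorkspaceCrawlerConfig {
  crawler?: { maxConcurrentRequests?: number };
}

interface SourceCrawlConfig {
  crawl?: { maxConcurrentRequests?: number };
}

// Default documented in the README.
const DEFAULT_MAX_CONCURRENT_REQUESTS = 5;

// Assumed precedence: source override, then workspace config, then the default.
function resolveMaxConcurrentRequests(
  workspace: WorkspaceCrawlerConfig,
  source: SourceCrawlConfig,
): number {
  const candidate =
    source.crawl?.maxConcurrentRequests ??
    workspace.crawler?.maxConcurrentRequests ??
    DEFAULT_MAX_CONCURRENT_REQUESTS;
  // Defensive clamping is an assumption of this sketch, not confirmed qli behavior.
  if (!Number.isFinite(candidate)) {
    return DEFAULT_MAX_CONCURRENT_REQUESTS;
  }
  return Math.max(1, Math.floor(candidate));
}

console.log(resolveMaxConcurrentRequests({ crawler: { maxConcurrentRequests: 8 } }, {}));
```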
|
|
233
|
+
|
|
198
234
|
## Supported Sources
|
|
199
235
|
|
|
200
236
|
Current source types:
|
|
201
237
|
|
|
202
238
|
- `file`
|
|
203
239
|
- `directory`
|
|
204
|
-
- `
|
|
240
|
+
- `page`
|
|
205
241
|
- `website`
|
|
206
242
|
- `rss`
|
|
207
243
|
- `markdown`
|
|
@@ -224,10 +260,12 @@ All commands support:
|
|
|
224
260
|
--workspace <path>
|
|
225
261
|
--config <path>
|
|
226
262
|
--json
|
|
263
|
+
--silent
|
|
227
264
|
--verbose
|
|
228
|
-
--quiet
|
|
229
265
|
```
|
|
230
266
|
|
|
267
|
+
Long-running commands print progress to stderr by default. Use `--silent` to suppress progress output. Use `--json` when another tool needs stable structured output.
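Keeping progress on stderr and results on stdout is what lets `--json` output stay parseable. The sketch below illustrates that split under stated assumptions: `StageProgress`, `ProgressSink`, and `createProgressReporter` are hypothetical names, and the real `ProgressHandler` type exported from `core/progress` is not shown in this diff.

```typescript
// Hypothetical event shape; qli's real ProgressHandler signature is not shown here.
interface StageProgress {
  stage: string;
  completed: number;
  total: number;
}

type ProgressSink = (line: string) => void;

// Returns a handler that reports progress unless silent mode is on.
function createProgressReporter(options: {
  silent: boolean;
  sink?: ProgressSink;
}): (event: StageProgress) => void {
  // In Node, console.error writes to stderr, so stdout stays clean for results.
  const sink: ProgressSink = options.sink ?? ((line: string) => console.error(line));
  return (event: StageProgress): void => {
    if (options.silent) {
      return;
    }
    sink(`[${event.stage}] ${event.completed}/${event.total}`);
  };
}

const report = createProgressReporter({ silent: false });
report({ stage: "ingest", completed: 3, total: 10 });
```

The optional `sink` parameter exists only so the behavior can be checked without capturing stderr.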
|
|
268
|
+
|
|
231
269
|
### Initialize
|
|
232
270
|
|
|
233
271
|
```bash
|
|
@@ -243,16 +281,25 @@ Add sources:
|
|
|
243
281
|
```bash
|
|
244
282
|
qli source add file ./docs/guide.md --name "Guide"
|
|
245
283
|
qli source add directory ./docs --name "Docs" --tag docs
|
|
246
|
-
qli source add
|
|
284
|
+
qli source add page https://example.com/docs/auth --name "Auth Page"
|
|
247
285
|
qli source add website https://example.com --name "Example Site" --max-depth 2 --max-pages 50
|
|
286
|
+
qli source add website https://example.com --name "Example Site" --max-concurrent-requests 8
|
|
287
|
+
qli source add website https://example.com --name "Example Site" --json
|
|
248
288
|
qli source add rss https://example.com/feed.xml --name "Release Feed"
|
|
289
|
+
qli source add rss https://example.com/feed.xml --name "Release Feed" --max-concurrent-requests 3
|
|
249
290
|
```
|
|
250
291
|
|
|
292
|
+
`page` stores one page. `website` crawls a site and may detect one feed during registration.
|
|
293
|
+
|
|
294
|
+
Website sources may detect one blog or news feed during registration. When qli can infer a shared article prefix such as `/blog/` or `/news/`, it adds that prefix to the website source excludes to reduce duplicate ingestion.
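The prefix inference described above could work roughly as follows. This is a minimal sketch, not qli's actual rule set: `inferSharedArticlePrefix` is a hypothetical function that assumes a prefix is "shared" when every article URL from the feed starts with the same first path segment and has at least one more segment after it.

```typescript
// Hypothetical helper; qli's real inference rules are not documented in this diff.
// Returns a prefix such as "/blog/" when every article URL shares the same
// first path segment, otherwise undefined.
function inferSharedArticlePrefix(articleUrls: string[]): string | undefined {
  let shared: string | undefined;
  for (const raw of articleUrls) {
    const segments = new URL(raw).pathname.split("/").filter((part) => part.length > 0);
    const first = segments[0];
    // Require a section segment plus at least an article slug.
    if (first === undefined || segments.length < 2) {
      return undefined;
    }
    if (shared !== undefined && shared !== first) {
      return undefined;
    }
    shared = first;
  }
  return shared === undefined ? undefined : `/${shared}/`;
}

const feedItems = [
  "https://example.com/blog/first-post",
  "https://example.com/blog/second-post",
];
console.log(inferSharedArticlePrefix(feedItems));
```

Excluding the inferred prefix from the website crawl leaves those articles to the `rss` source, which is how the duplicate ingestion mentioned above would be avoided.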
|
|
295
|
+
Website and RSS sources default to `5` remote requests in flight per source. Override that in `config.yaml` or on the source.
|
|
296
|
+
|
|
251
297
|
List and manage them:
|
|
252
298
|
|
|
253
299
|
```bash
|
|
254
300
|
qli source list
|
|
255
301
|
qli source config <source-id> --retention-days 30
|
|
302
|
+
qli source config <source-id> --max-concurrent-requests 2
|
|
256
303
|
qli source config <source-id> --name "Docs Feed" --tag rss docs
|
|
257
304
|
qli source disable <source-id>
|
|
258
305
|
qli source enable <source-id>
|
|
@@ -274,6 +321,8 @@ Or pull every model that is available on the current machine:
|
|
|
274
321
|
qli models pull
|
|
275
322
|
```
|
|
276
323
|
|
|
324
|
+
By default, `qli models pull` stores model assets in `~/.qli/models/` so multiple workspaces can reuse them.
|
|
325
|
+
|
|
277
326
|
Then ask for documents related to an existing document id or URI:
|
|
278
327
|
|
|
279
328
|
```bash
|
|
@@ -287,9 +336,13 @@ qli related https://example.com/docs/auth
|
|
|
287
336
|
qli ingest
|
|
288
337
|
qli chunk
|
|
289
338
|
qli index build
|
|
339
|
+
qli rebuild --silent
|
|
290
340
|
```
|
|
291
341
|
|
|
292
|
-
|
|
342
|
+
`qli ingest` fetches source content, updates affected chunks, and refreshes the index.
|
|
343
|
+
Remote website and RSS fetches run concurrently. By default qli allows `5` in-flight requests per source.
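A bounded number of in-flight requests is commonly implemented with a small pool of worker "lanes" that pull from a shared queue. The sketch below shows that general technique. It is not taken from qli's `core/concurrency` module, and `mapWithConcurrency` is a hypothetical name. The demo uses a timer in place of real network calls.

```typescript
// Generic bounded-concurrency map; a sketch of the technique, not qli's implementation.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i += 1) {
    lanes.push(runLane());
  }
  await Promise.all(lanes);
  return results; // results keep the input order
}

// Demo with simulated work; 5 mirrors the documented default per-source limit.
mapWithConcurrency(["a", "b", "c"], 5, async (id) => {
  await new Promise<void>((resolve) => setTimeout(resolve, 1));
  return id.toUpperCase();
}).then((out) => console.log(out));
```

Because JavaScript runs these callbacks on a single thread, the shared `next` counter needs no lock: each lane reads and increments it before reaching its next `await`.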
|
|
344
|
+
|
|
345
|
+
Use `qli rebuild` when you want the explicit full pipeline command:
|
|
293
346
|
|
|
294
347
|
```bash
|
|
295
348
|
qli rebuild
|
|
@@ -304,9 +357,9 @@ Search:
|
|
|
304
357
|
```bash
|
|
305
358
|
qli search "pricing API limits"
|
|
306
359
|
qli search "refund policy" --tag support --top-k 20
|
|
307
|
-
qli search --source-type rss,
|
|
360
|
+
qli search --source-type rss,page --since 2026-05-01 --has-publication-date --top-k 25
|
|
308
361
|
qli search --source-name "Release Feed,Company Blog" --uri-prefix https://example.com/news,https://example.com/blog
|
|
309
|
-
qli search --source-type rss,
|
|
362
|
+
qli search --source-type rss,page --top-k 25 --json
|
|
310
363
|
qli search "authentication" --json
|
|
311
364
|
```
|
|
312
365
|
|
package/dist/chunk/chunker.d.ts
CHANGED
|
@@ -1,9 +1,11 @@
|
|
|
1
|
+
import { type ProgressHandler } from "../core/progress.js";
|
|
1
2
|
import type { ChunkRecord, DocumentRecord, WorkspaceConfig } from "../types/models.js";
|
|
2
3
|
export declare function buildChunksForDocument(document: DocumentRecord, markdown: string, config: WorkspaceConfig, prior?: Map<string, ChunkRecord>, seenAt?: string): ChunkRecord[];
|
|
3
|
-
export declare function chunkDocuments({ workspacePath, sourceId, documentId }: {
|
|
4
|
+
export declare function chunkDocuments({ workspacePath, sourceId, documentId, progress }: {
|
|
4
5
|
workspacePath: string;
|
|
5
6
|
sourceId?: string;
|
|
6
7
|
documentId?: string;
|
|
8
|
+
progress?: ProgressHandler;
|
|
7
9
|
}): Promise<{
|
|
8
10
|
chunksWritten: number;
|
|
9
11
|
}>;
|