@tryformation/querylight-cli 0.1.1 → 0.2.0

package/README.md CHANGED
@@ -26,6 +26,8 @@ Run without installing globally:
  bunx @tryformation/querylight-cli init
  ```
 
+ For agent and Python automation examples that use `bunx` and `uv`, see [`examples/skills/qli-bunx-uv/SKILL.md`](https://github.com/formation-res/querylight-cli/blob/main/examples/skills/qli-bunx-uv/SKILL.md).
+
  Install as a dependency:
 
  ```bash
@@ -38,6 +40,12 @@ Then run:
  npx qli --help
  ```
 
+ If you prefer to avoid a local install, use:
+
+ ```bash
+ bunx @tryformation/querylight-cli --help
+ ```
+
  ## Release
 
  Publish releases from semantic version tags such as `0.1.1`.
@@ -78,6 +86,8 @@ Initialize a workspace:
  qli init
  ```
 
+ `qli init` creates the workspace config, enables dense and sparse retrieval for new workspaces, and pulls missing model assets when the runtime is available.
+
  Add a local docs directory:
 
  ```bash
@@ -87,7 +97,7 @@ qli source add directory ./docs --name "Local Docs" --tag docs
  Build the knowledge base:
 
  ```bash
- qli rebuild
+ qli ingest
  ```
 
  Search it:
@@ -109,6 +119,18 @@ Generate retrieval context:
  qli context "How do I authenticate API requests?" --top-k 8
  ```
 
+ ## Example Skill: `qli` with `bunx` and `uv`
+
+ The repository includes an example skill for running `qli` without a global install and calling it from Python with `uv`:
+
+ - [`examples/skills/qli-bunx-uv/SKILL.md`](https://github.com/formation-res/querylight-cli/blob/main/examples/skills/qli-bunx-uv/SKILL.md)
+
+ It covers:
+
+ - running `qli` with `bunx @tryformation/querylight-cli`
+ - using `--json` for automation and agents
+ - calling `qli search` and `qli context` from Python with `subprocess`
+
  ## Example: Index `querylight.tryformation.com`
 
  This example uses a local linked build of `qli` to create a test knowledge base for the Querylight documentation website.
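The skill section added in this hunk mentions calling `qli search` and `qli context` from Python with `subprocess`. A minimal sketch of that pattern follows. It is not taken from `SKILL.md`: the helper names `build_qli_command` and `run_qli` are hypothetical, and because this diff does not show the shape of the `--json` output, the result is parsed generically rather than mapped to specific fields.

```python
import json
import subprocess

# Package name and flags (--json, --top-k) are the ones shown in the README diff.
QLI_VIA_BUNX = ["bunx", "@tryformation/querylight-cli"]


def build_qli_command(subcommand, query, top_k=None):
    """Build the argv list for a qli call (hypothetical helper)."""
    command = [*QLI_VIA_BUNX, subcommand, query, "--json"]
    if top_k is not None:
        command += ["--top-k", str(top_k)]
    return command


def run_qli(subcommand, query, top_k=None):
    """Run qli and return its stdout parsed as JSON.

    Assumes stdout carries only the JSON document; the README states
    that progress output goes to stderr.
    """
    completed = subprocess.run(
        build_qli_command(subcommand, query, top_k),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)
```

With `bunx` on the `PATH`, `run_qli("search", "authentication")` or `run_qli("context", "How do I authenticate API requests?", top_k=8)` would mirror the shell commands in the README.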
@@ -146,10 +168,13 @@ qli source add website https://querylight.tryformation.com \
  --tag docs
  ```
 
- 5. Build the local index:
+ `qli source add website` may also detect one blog or news feed and register it as a separate `rss` source. Use `--json` when another tool needs the full list of created sources.
+ Use `qli source add page` for one page. Use `qli source add website` when you want crawling or feed detection.
+
+ 5. Ingest content and refresh the local index:
 
  ```bash
- qli rebuild
+ qli ingest
  ```
 
  6. Inspect and query the result:
@@ -189,19 +214,30 @@ The default workspace is `.kb/`.
  logs/
  ```
 
+ Vector model downloads are shared across workspaces under `~/.qli/models/` by default. `qli init` pulls missing model assets for enabled retrieval modes, so a new workspace is ready for vector indexing after setup.
+
  Use a custom workspace with:
 
  ```bash
  qli --workspace ./my-kb <command>
  ```
 
+ Control the default remote concurrency in `config.yaml`:
+
+ ```yaml
+ crawler:
+   maxConcurrentRequests: 5
+ ```
+
+ Set `crawl.maxConcurrentRequests` on a website or RSS source when one source needs a different limit.
+
  ## Supported Sources
 
  Current source types:
 
  - `file`
  - `directory`
- - `url`
+ - `page`
  - `website`
  - `rss`
  - `markdown`
@@ -224,10 +260,12 @@ All commands support:
  --workspace <path>
  --config <path>
  --json
+ --silent
  --verbose
- --quiet
  ```
 
+ Long-running commands print progress to stderr by default. Use `--silent` to suppress progress output. Use `--json` when another tool needs stable structured output.
+
  ### Initialize
 
  ```bash
@@ -243,16 +281,25 @@ Add sources:
  ```bash
  qli source add file ./docs/guide.md --name "Guide"
  qli source add directory ./docs --name "Docs" --tag docs
- qli source add url https://example.com/docs/auth --name "Auth Page"
+ qli source add page https://example.com/docs/auth --name "Auth Page"
  qli source add website https://example.com --name "Example Site" --max-depth 2 --max-pages 50
+ qli source add website https://example.com --name "Example Site" --max-concurrent-requests 8
+ qli source add website https://example.com --name "Example Site" --json
  qli source add rss https://example.com/feed.xml --name "Release Feed"
+ qli source add rss https://example.com/feed.xml --name "Release Feed" --max-concurrent-requests 3
  ```
 
+ `page` stores one page. `website` crawls a site and may detect one feed during registration.
+
+ Website sources may detect one blog or news feed during registration. When qli can infer a shared article prefix such as `/blog/` or `/news/`, it adds that prefix to the website source excludes to reduce duplicate ingestion.
+ Website and RSS sources default to `5` remote requests in flight per source. Override that in `config.yaml` or on the source.
+
  List and manage them:
 
  ```bash
  qli source list
  qli source config <source-id> --retention-days 30
+ qli source config <source-id> --max-concurrent-requests 2
  qli source config <source-id> --name "Docs Feed" --tag rss docs
  qli source disable <source-id>
  qli source enable <source-id>
@@ -274,6 +321,8 @@ Or pull every model that is available on the current machine:
  qli models pull
  ```
 
+ By default, `qli models pull` stores model assets in `~/.qli/models/` so multiple workspaces can reuse them.
+
  Then ask for documents related to an existing document id or URI:
 
  ```bash
@@ -287,9 +336,13 @@ qli related https://example.com/docs/auth
  qli ingest
  qli chunk
  qli index build
+ qli rebuild --silent
  ```
 
- Run the full pipeline:
+ `qli ingest` fetches source content, updates affected chunks, and refreshes the index.
+ Remote website and RSS fetches run concurrently. By default qli allows `5` in-flight requests per source.
+
+ Use `qli rebuild` when you want the explicit full pipeline command:
 
  ```bash
  qli rebuild
@@ -304,9 +357,9 @@ Search:
  ```bash
  qli search "pricing API limits"
  qli search "refund policy" --tag support --top-k 20
- qli search --source-type rss,url --since 2026-05-01 --has-publication-date --top-k 25
+ qli search --source-type rss,page --since 2026-05-01 --has-publication-date --top-k 25
  qli search --source-name "Release Feed,Company Blog" --uri-prefix https://example.com/news,https://example.com/blog
- qli search --source-type rss,url --top-k 25 --json
+ qli search --source-type rss,page --top-k 25 --json
  qli search "authentication" --json
  ```
 
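The README changes above describe website and RSS fetches as running concurrently, with a default of `5` in-flight requests per source. The sketch below shows one common way to enforce that kind of bound, using an `asyncio.Semaphore`. It is a generic illustration and not qli's implementation; `fetch_all`, `demo`, and the simulated `fake_fetch` are hypothetical names.

```python
import asyncio


async def fetch_all(urls, fetch, max_concurrent_requests=5):
    """Run fetch(url) for every URL with at most N calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_requests)

    async def bounded(url):
        # Each task waits here until one of the N slots is free.
        async with semaphore:
            return await fetch(url)

    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(bounded(url) for url in urls))


async def demo():
    in_flight = 0
    peak = 0

    async def fake_fetch(url):
        # Stand-in for a real HTTP request; it only tracks concurrency.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return url

    urls = [f"https://example.com/page/{i}" for i in range(20)]
    results = await fetch_all(urls, fake_fetch, max_concurrent_requests=5)
    return results, peak


results, peak = asyncio.run(demo())
```

Lowering `max_concurrent_requests` for a single call corresponds in spirit to the per-source `--max-concurrent-requests` override shown in the diff.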
@@ -1,9 +1,11 @@
+ import { type ProgressHandler } from "../core/progress.js";
  import type { ChunkRecord, DocumentRecord, WorkspaceConfig } from "../types/models.js";
  export declare function buildChunksForDocument(document: DocumentRecord, markdown: string, config: WorkspaceConfig, prior?: Map<string, ChunkRecord>, seenAt?: string): ChunkRecord[];
- export declare function chunkDocuments({ workspacePath, sourceId, documentId }: {
+ export declare function chunkDocuments({ workspacePath, sourceId, documentId, progress }: {
      workspacePath: string;
      sourceId?: string;
      documentId?: string;
+     progress?: ProgressHandler;
  }): Promise<{
      chunksWritten: number;
  }>;