aiex-cli 0.0.5-beta.6 → 0.0.6-beta.2
- package/README.md +1 -11
- package/dist/cli.mjs +322 -951
- package/dist/{doctor-collector-BpqhXNcO.mjs → doctor-collector-CGo5dgHm.mjs} +70 -52
- package/dist/index.d.mts +88 -91
- package/dist/index.mjs +1 -1
- package/dist/web/assets/AISettings-Dbma0Oku.js +264 -0
- package/dist/web/assets/{DataBrowser-BGkZb9FV.js → DataBrowser-GAA-pGq0.js} +1 -1
- package/dist/web/assets/ExtractionViewer-CrQMLtX7.js +1 -0
- package/dist/web/assets/{api-client-gQAAOw0v.js → api-client-b4ZBXpNH.js} +1 -1
- package/dist/web/assets/{index-BQKZKzzP.js → index-CdQgz6dJ.js} +8 -8
- package/dist/web/assets/index-D0So2rJE.css +2 -0
- package/dist/web/index.html +3 -3
- package/dist/{zh-CN-DkillGHx.mjs → zh-CN-wEUNhuHM.mjs} +18 -18
- package/package.json +2 -3
- package/dist/web/assets/AISettings-sVI4PTNB.js +0 -264
- package/dist/web/assets/ExtractionViewer-DNrkSECj.js +0 -1
- package/dist/web/assets/index-BU58oIRd.css +0 -2
package/README.md
CHANGED
@@ -202,23 +202,13 @@ aiex completion fish | source
 
 <br>
 
-## 📄 Large Document Processing
-
-`aiex` uses a unified text extraction pipeline for both short and very large documents. Source files are converted to text or Markdown first; images are converted with OCR before structured extraction.
-
-- **Token-Aware AST Splitting**: Parses structural Markdown elements (headings, paragraphs, lists) using an AST-based parser (`marked.lexer`) and splits them using precise token counters (`js-tiktoken`). Active heading hierarchies are tracked and prepended to each chunk as context. Short documents run through the same pipeline as a single chunk.
-- **Concurrency Limiting**: To respect strict model rate limits, chunk extractions are processed in parallel with a strict concurrency limit (capped at 2 concurrent requests).
-- **Candidate & Evidence Merging**: Chunk results are merged into schema-shaped candidates, with evidence coverage used to select scalar conflicts and preserve traceability.
-- **Schema Validation & Correction**: Merged output is validated against the JSON Schema. When correction is needed, the corrected output is rechecked against evidence before being written.
-
-<br>
-
 ## 🔧 AI Configuration
 
 aiex works with any OpenAI-compatible API provider. Configure in the Web UI (AI Settings panel):
 
 - **Provider** — Set your base URL and API key
 - **Models** — Add models with vision and/or structured output capabilities
+- **Documents** — Choose a PDF converter (`unpdf`, `mineru`, `mineru_api`, or `external`); image input automatically uses a vision model when available, otherwise system OCR on supported platforms
 - **Prompts** — Customize system and user prompt templates with `{schema}` and `{text}` placeholders
 - **Integrations** — Optionally connect Notion from AI Settings; use Connect & Map to bind a schema to an existing Notion data source
 
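
The Prompts entry in the README hunk above mentions `{schema}` and `{text}` placeholders in the prompt templates. As a rough illustration of how such placeholders might be filled, here is a minimal sketch. The `renderPrompt` function, its signature, and the choice to leave unknown placeholders untouched are assumptions made for this example; the diff does not show how aiex performs the substitution.

```typescript
// Hypothetical helper (not part of aiex's published API): fills {name}
// placeholders in a prompt template from a map of values.
function renderPrompt(template: string, values: Record<string, string>): string {
  // A function replacer is used so that characters such as "$" in the
  // inserted text are taken literally rather than as replacement patterns.
  return template.replace(/\{(\w+)\}/g, (match: string, key: string): string => {
    const value = Object.prototype.hasOwnProperty.call(values, key)
      ? values[key]
      : undefined;
    // Assumption: placeholders with no supplied value are left as written.
    return value ?? match;
  });
}

// Example usage with the two placeholder names from the README.
const userTemplate = "Extract data matching this schema:\n{schema}\n\nDocument:\n{text}";
const prompt = renderPrompt(userTemplate, {
  schema: '{"type":"object","properties":{"total":{"type":"number"}}}',
  text: "Invoice total: $42.00",
});
console.log(prompt);
```

Because the replacement is done in a single pass, braces inside the inserted schema JSON are not scanned again for placeholders.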