aiex-cli 0.0.3-beta.4 → 0.0.3-beta.5

package/README.md CHANGED
@@ -24,6 +24,7 @@ npm install -g aiex-cli
  aiex web # configure schemas, AI, integrations, and inspect data
  aiex schema # generate SQLite from JSON Schema files
  aiex extract -s invoice -f invoice.pdf # extract data with AI and insert into database
+ aiex watch -s invoice -d ./watch_folder # watch folder daemon for automatic extraction
  ```
 
  <br>
@@ -32,9 +33,11 @@ aiex extract -s invoice -f invoice.pdf # extract data with AI and insert into d
 
  - **JSON Schema → SQLite** — Define tables as JSON Schema files, generate Drizzle ORM schema, and migrate to SQLite
  - **Web Configuration & Viewer** — Browser-based UI for designing schemas, configuring integrations, previewing prompts, and browsing extracted data
- - **AI Extraction** — Extract structured data from text, images, and PDFs using any OpenAI-compatible provider (OpenAI, Anthropic, Ollama, DeepSeek, local models, etc.)
+ - **AI Extraction** — Extract structured data from files (text, images, PDFs) using any OpenAI-compatible provider (OpenAI, Anthropic, Ollama, DeepSeek, local models, etc.)
  - **Interactive Mode** — Run `aiex extract` without arguments for a guided extraction workflow
  - **Batch Mode** — `aiex extract -d <dir>` processes entire directories with optional glob filtering
+ - **Incremental Extraction** — File hash deduplication skips already-processed files; use `--force` to override
+ - **Data Export** — `aiex export` exports SQLite tables to CSV or Excel (.xlsx)
  - **Notion Sync** — Optionally sync CLI extraction results to configured Notion data sources
  - **Extraction Audit Trail** — Every extraction is recorded with status, input source, output path, token usage, database inserts, Notion pages, and errors
  - **Built-in Model Registry** — Knows capabilities of 2000+ models (vision, structured output) so you don't have to guess
@@ -64,27 +67,28 @@ Converts your JSON Schema files into a SQLite database with full migration suppo
  ```bash
  aiex extract # interactive mode (prompts for schema & input)
  aiex extract -s <schema> -f <file> # from file (txt, pdf, png, jpg, ...)
- aiex extract -s <schema> -t <text> # from text
- aiex extract -s <schema> -f <file> -m <model> # specify AI model (overrides auto-selection)
- aiex extract -s <schema> -f <file> --no-insert # extract and save JSON without inserting into SQLite
- aiex extract -s <schema> -d <directory> # batch extract all supported files in a directory
- aiex extract -s <schema> -d <dir> -g "*.pdf" # batch with glob filter
- aiex extract history # list extraction audit records
- aiex extract show <audit-id> # show full audit record JSON
- aiex extract retry <audit-id> # retry a previous extraction
- aiex extract rm <audit-id> # delete an audit record and cached upload
+ aiex extract -s <schema> -f <file> -m <model> # specify AI model (overrides auto-selection)
+ aiex extract -s <schema> -f <file> --no-insert # extract and save JSON without inserting into SQLite
+ aiex extract -s <schema> -f <file> --force # force re-extraction even if already processed
+ aiex extract -s <schema> -d <directory> # batch extract all supported files in a directory
+ aiex extract -s <schema> -d <dir> -g "*.pdf" # batch with glob filter
+ aiex extract history # list extraction audit records
+ aiex extract show <audit-id> # show full audit record JSON
+ aiex extract retry <audit-id> # retry a previous extraction
+ aiex extract rm <audit-id> # delete an audit record and cached upload
  ```
 
  The AI reads your document and outputs structured JSON matching your schema.
 
  **Examples:**
  ```bash
- aiex extract # interactive mode
+ aiex extract # interactive mode
  aiex extract -s paper -f research.pdf # save result to .aiex/extracted/ and insert into database
- aiex extract -s paper -f research.pdf --no-insert # save result only, skip database insert
+ aiex extract -s paper -f research.pdf --no-insert # save result only, skip database insert
  aiex extract -s paper -f research.pdf -m gpt-4o # use a specific model
- aiex extract -s paper -d ./papers -g "*.pdf" # batch extract PDFs from a directory
- aiex extract history # inspect recent extraction runs
+ aiex extract -s paper -f research.pdf --force # force re-extraction even if already processed
+ aiex extract -s paper -d ./papers -g "*.pdf" # batch extract PDFs from a directory
+ aiex extract history # inspect recent extraction runs
  ```
  Saves the extracted result to `.aiex/extracted/<schema-name>-<timestamp>.json` with fields like `title`, `firstAuthor`, `journal`, `year` — exactly as defined in your schema. Data is automatically inserted into the SQLite database.
 
@@ -92,6 +96,24 @@ By default, aiex automatically selects a model based on your input type (vision-
 
  Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records include the run status (`running`, `succeeded`, `failed`, or `stale`), schema name, input source, output file, token usage, inserted table rows, synced Notion pages, retry lineage, and error message. Deleting an audit record removes its cached upload, but keeps extracted JSON result files to avoid accidental data loss.
 
+ ### 4. Watch Folder Daemon (Auto-Extraction)
+
+ ```bash
+ aiex watch -s <schema> -d <folder>
+ ```
+
+ Runs a background watcher daemon to monitor a folder for new incoming files (such as scanned documents or downloads), automatically performing offline data extraction, database insertion, and system notifications.
+
+ ### 5. Export Data
+
+ ```bash
+ aiex export -s <schema> # export to CSV (default)
+ aiex export -s <schema> -f xlsx -o output.xlsx # export to Excel
+ aiex export -t <table> -f csv -o output.csv # export a specific table by name
+ ```
+
+ Exports all extracted data for a given schema (or table) from the SQLite database to CSV or Excel format.
+
  <br>
 
  ## 📖 Commands
@@ -101,10 +123,11 @@ Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records
  | `aiex schema` | Parse JSON Schema files and migrate to SQLite |
  | `aiex schema --generate` | Generate Drizzle schema code only (skip migration) |
  | `aiex web` | Launch visual schema/configuration UI and data viewer in browser |
- | `aiex extract` | Interactive mode — prompts for schema and input source |
- | `aiex extract -s <name> -f <file>` | Extract structured data from documents and insert into SQLite database |
+ | `aiex extract` | Interactive mode — prompts for schema and file/directory input |
+ | `aiex extract -s <name> -f <file>` | Extract structured data from a file and insert into SQLite database |
  | `aiex extract -s <name> -f <file> -m <model>` | Extract with a specific AI model |
  | `aiex extract -s <name> -f <file> --no-insert` | Extract and save JSON without inserting into SQLite |
+ | `aiex extract -s <name> -f <file> --force` | Force re-extraction even if the file has already been processed |
  | `aiex extract -s <name> -d <dir>` | Batch extract all supported files in a directory |
  | `aiex extract -s <name> -d <dir> -g "*.pdf"` | Batch extract with glob filter |
  | `aiex extract history` | List extraction audit records |
@@ -112,6 +135,10 @@ Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records
  | `aiex extract retry <audit-id>` | Retry a previous extraction run |
  | `aiex extract retry <audit-id> --no-insert` | Retry without inserting into SQLite |
  | `aiex extract rm <audit-id>` | Delete an audit record and its cached upload |
+ | `aiex watch -s <name> -d <dir>` | Watch a directory for new files and automatically extract data |
+ | `aiex watch -s <name> -d <dir> --no-insert` | Watch and save JSON without inserting into SQLite |
+ | `aiex export -s <name>` | Export extracted data for a schema to CSV |
+ | `aiex export -s <name> -f xlsx -o <file>` | Export to Excel (.xlsx) |
  | `aiex doctor` | System and configuration diagnostics |
  | `aiex completion bash\|zsh\|fish` | Generate shell completion scripts |
 
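
Note on the new "Incremental Extraction" feature: the README change describes it as file hash deduplication with a `--force` override, but this diff does not show how aiex implements it. The following is only a minimal sketch of the general technique, assuming SHA-256 content hashing and an in-memory set of seen digests. `hashFile` and `ProcessedIndex` are hypothetical names for illustration, not aiex APIs, and a real tool would presumably persist the digests between runs.

```typescript
import { createHash } from "node:crypto";
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: hex-encoded SHA-256 digest of a file's bytes.
function hashFile(path: string): string {
  return createHash("sha256").update(readFileSync(path)).digest("hex");
}

// Hypothetical in-memory index of content hashes that were already processed.
// Assumption: dedup is keyed on file content, not on file name or mtime.
class ProcessedIndex {
  private seen = new Set<string>();

  // Returns true when the file should be extracted, false when it can be skipped.
  // `force` mirrors the idea behind the README's --force flag.
  shouldProcess(path: string, force = false): boolean {
    const digest = hashFile(path);
    if (!force && this.seen.has(digest)) return false;
    this.seen.add(digest);
    return true;
  }
}

// Usage: two files with identical bytes and one forced re-run.
const dir = mkdtempSync(join(tmpdir(), "dedup-"));
const a = join(dir, "a.txt");
const b = join(dir, "b.txt");
writeFileSync(a, "invoice 001");
writeFileSync(b, "invoice 001"); // same content under a different name

const index = new ProcessedIndex();
const results = [
  index.shouldProcess(a),       // first sighting of this content
  index.shouldProcess(a),       // unchanged file
  index.shouldProcess(b),       // duplicate content, different path
  index.shouldProcess(a, true), // forced re-extraction
];
console.log(results);
```

Keying on content rather than path means a renamed or re-downloaded copy is skipped while an edited file is picked up again. Whether aiex makes the same choice is not stated in the diff.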