aiex-cli 0.0.3-beta.4 → 0.0.3-beta.6

package/README.md CHANGED
@@ -24,6 +24,7 @@ npm install -g aiex-cli
  aiex web # configure schemas, AI, integrations, and inspect data
  aiex schema # generate SQLite from JSON Schema files
  aiex extract -s invoice -f invoice.pdf # extract data with AI and insert into database
+ aiex watch -s invoice -d ./watch_folder # watch folder daemon for automatic extraction
  ```
 
  <br>
@@ -32,9 +33,11 @@ aiex extract -s invoice -f invoice.pdf # extract data with AI and insert into d
 
  - **JSON Schema → SQLite** — Define tables as JSON Schema files, generate Drizzle ORM schema, and migrate to SQLite
  - **Web Configuration & Viewer** — Browser-based UI for designing schemas, configuring integrations, previewing prompts, and browsing extracted data
- - **AI Extraction** — Extract structured data from text, images, and PDFs using any OpenAI-compatible provider (OpenAI, Anthropic, Ollama, DeepSeek, local models, etc.)
+ - **AI Extraction** — Extract structured data from files (text, images, PDFs) using any OpenAI-compatible provider (OpenAI, Anthropic, Ollama, DeepSeek, local models, etc.)
  - **Interactive Mode** — Run `aiex extract` without arguments for a guided extraction workflow
  - **Batch Mode** — `aiex extract -d <dir>` processes entire directories with optional glob filtering
+ - **Incremental Extraction** — File hash deduplication skips already-processed files; use `--force` to override
+ - **Data Dump** — `aiex dump` exports SQLite tables to CSV or Excel (.xlsx)
  - **Notion Sync** — Optionally sync CLI extraction results to configured Notion data sources
  - **Extraction Audit Trail** — Every extraction is recorded with status, input source, output path, token usage, database inserts, Notion pages, and errors
  - **Built-in Model Registry** — Knows capabilities of 2000+ models (vision, structured output) so you don't have to guess
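The Incremental Extraction entry added in this hunk describes skipping files whose content hash has already been seen. The README does not say which hash function or storage aiex uses, so the following is only a minimal Python sketch of the general technique, assuming SHA-256 and an in-memory set; `file_digest` and `files_to_process` are hypothetical names, not part of aiex.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def files_to_process(paths, seen_hashes, force=False):
    """Return the paths whose content hash has not been seen before.

    `seen_hashes` (a set) is updated in place. With force=True nothing
    is skipped, mirroring the idea behind a `--force` override.
    """
    pending = []
    for path in paths:
        digest = file_digest(Path(path))
        if force or digest not in seen_hashes:
            pending.append(path)
        seen_hashes.add(digest)
    return pending
```

Because the check is keyed on content rather than file name, a renamed copy of an already-processed file would also be skipped in this sketch.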
@@ -64,27 +67,28 @@ Converts your JSON Schema files into a SQLite database with full migration suppo
  ```bash
  aiex extract # interactive mode (prompts for schema & input)
  aiex extract -s <schema> -f <file> # from file (txt, pdf, png, jpg, ...)
- aiex extract -s <schema> -t <text> # from text
- aiex extract -s <schema> -f <file> -m <model> # specify AI model (overrides auto-selection)
- aiex extract -s <schema> -f <file> --no-insert # extract and save JSON without inserting into SQLite
- aiex extract -s <schema> -d <directory> # batch extract all supported files in a directory
- aiex extract -s <schema> -d <dir> -g "*.pdf" # batch with glob filter
- aiex extract history # list extraction audit records
- aiex extract show <audit-id> # show full audit record JSON
- aiex extract retry <audit-id> # retry a previous extraction
- aiex extract rm <audit-id> # delete an audit record and cached upload
+ aiex extract -s <schema> -f <file> -m <model> # specify AI model (overrides auto-selection)
+ aiex extract -s <schema> -f <file> --no-insert # extract and save JSON without inserting into SQLite
+ aiex extract -s <schema> -f <file> --force # force re-extraction even if already processed
+ aiex extract -s <schema> -d <directory> # batch extract all supported files in a directory
+ aiex extract -s <schema> -d <dir> -g "*.pdf" # batch with glob filter
+ aiex extract history # list extraction audit records
+ aiex extract show <audit-id> # show full audit record JSON
+ aiex extract retry <audit-id> # retry a previous extraction
+ aiex extract rm <audit-id> # delete an audit record and cached upload
  ```
 
  The AI reads your document and outputs structured JSON matching your schema.
 
  **Examples:**
  ```bash
- aiex extract # interactive mode
+ aiex extract # interactive mode
  aiex extract -s paper -f research.pdf # save result to .aiex/extracted/ and insert into database
- aiex extract -s paper -f research.pdf --no-insert # save result only, skip database insert
+ aiex extract -s paper -f research.pdf --no-insert # save result only, skip database insert
  aiex extract -s paper -f research.pdf -m gpt-4o # use a specific model
- aiex extract -s paper -d ./papers -g "*.pdf" # batch extract PDFs from a directory
- aiex extract history # inspect recent extraction runs
+ aiex extract -s paper -f research.pdf --force # force re-extraction even if already processed
+ aiex extract -s paper -d ./papers -g "*.pdf" # batch extract PDFs from a directory
+ aiex extract history # inspect recent extraction runs
  ```
  Saves the extracted result to `.aiex/extracted/<schema-name>-<timestamp>.json` with fields like `title`, `firstAuthor`, `journal`, `year` — exactly as defined in your schema. Data is automatically inserted into the SQLite database.
 
@@ -92,6 +96,24 @@ By default, aiex automatically selects a model based on your input type (vision-
 
  Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records include the run status (`running`, `succeeded`, `failed`, or `stale`), schema name, input source, output file, token usage, inserted table rows, synced Notion pages, retry lineage, and error message. Deleting an audit record removes its cached upload, but keeps extracted JSON result files to avoid accidental data loss.
 
+ ### 4. Watch Folder Daemon (Auto-Extraction)
+
+ ```bash
+ aiex watch -s <schema> -d <folder>
+ ```
+
+ Runs a background watcher daemon that monitors a folder for new incoming files (such as scanned documents or downloads), automatically running offline data extraction, inserting the results into the database, and sending system notifications.
+
+ ### 5. Dump Data
+
+ ```bash
+ aiex dump -s <schema> # dump to CSV (default)
+ aiex dump -s <schema> -f xlsx -o output.xlsx # dump to Excel
+ aiex dump -t <table> -f csv -o output.csv # dump a specific table by name
+ ```
+
+ Dumps all extracted data for a given schema (or table) from the SQLite database to CSV or Excel format.
+
  <br>
 
  ## 📖 Commands
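The "Dump Data" section added in this hunk exports SQLite tables to CSV or Excel. How `aiex dump` does this internally is not shown in the diff; as a rough illustration of the CSV case only, here is a minimal Python sketch using the standard `sqlite3` and `csv` modules. The function name `dump_table_to_csv` and the `invoice` table in the usage note are hypothetical.

```python
import csv
import io
import sqlite3


def dump_table_to_csv(conn: sqlite3.Connection, table: str, out) -> int:
    """Write every row of `table` to `out` as CSV with a header row.

    Returns the number of data rows written.
    """
    # Table names cannot be bound as SQL parameters, so check the
    # requested name against the database catalog before quoting it.
    known = {
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    if table not in known:
        raise ValueError(f"unknown table: {table}")

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out)
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([col[0] for col in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count
```

To try it, create an in-memory database with `sqlite3.connect(":memory:")`, add an `invoice` table with a few rows, and pass an `io.StringIO()` or an open file (opened with `newline=""`) as `out`.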
@@ -101,10 +123,11 @@ Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records
  | `aiex schema` | Parse JSON Schema files and migrate to SQLite |
  | `aiex schema --generate` | Generate Drizzle schema code only (skip migration) |
  | `aiex web` | Launch visual schema/configuration UI and data viewer in browser |
- | `aiex extract` | Interactive mode — prompts for schema and input source |
- | `aiex extract -s <name> -f <file>` | Extract structured data from documents and insert into SQLite database |
+ | `aiex extract` | Interactive mode — prompts for schema and file/directory input |
+ | `aiex extract -s <name> -f <file>` | Extract structured data from a file and insert into SQLite database |
  | `aiex extract -s <name> -f <file> -m <model>` | Extract with a specific AI model |
  | `aiex extract -s <name> -f <file> --no-insert` | Extract and save JSON without inserting into SQLite |
+ | `aiex extract -s <name> -f <file> --force` | Force re-extraction even if the file has already been processed |
  | `aiex extract -s <name> -d <dir>` | Batch extract all supported files in a directory |
  | `aiex extract -s <name> -d <dir> -g "*.pdf"` | Batch extract with glob filter |
  | `aiex extract history` | List extraction audit records |
@@ -112,12 +135,18 @@ Every extraction is also recorded under `.aiex/extracted/_audit/`. Audit records
  | `aiex extract retry <audit-id>` | Retry a previous extraction run |
  | `aiex extract retry <audit-id> --no-insert` | Retry without inserting into SQLite |
  | `aiex extract rm <audit-id>` | Delete an audit record and its cached upload |
+ | `aiex watch -s <name> -d <dir>` | Watch a directory for new files and automatically extract data |
+ | `aiex watch -s <name> -d <dir> --no-insert` | Watch and save JSON without inserting into SQLite |
+ | `aiex dump -s <name>` | Dump extracted data for a schema to CSV |
+ | `aiex dump -s <name> -f xlsx -o <file>` | Dump to Excel (.xlsx) |
  | `aiex doctor` | System and configuration diagnostics |
  | `aiex completion bash\|zsh\|fish` | Generate shell completion scripts |
 
  ### Shell Completions
 
- Enable tab completion for commands and options:
+ Each release ships pre-generated completion files in `dist/completions/`. You can either load completions dynamically or install them permanently.
+
+ **Dynamic (session only):**
 
  ```bash
  # bash
@@ -130,9 +159,25 @@ source <(aiex completion zsh)
  aiex completion fish | source
  ```
 
- To make it permanent, add the `source` line to your shell config file (`~/.bashrc`, `~/.zshrc`, or `~/.config/fish/config.fish`).
+ **Permanent install (recommended):**
+
+ ```bash
+ # bash — write to system completions directory (requires root)
+ aiex completion bash | sudo tee /etc/bash_completion.d/aiex > /dev/null
+ # or for user-level (no sudo):
+ mkdir -p ~/.local/share/bash-completion/completions
+ aiex completion bash > ~/.local/share/bash-completion/completions/aiex
+
+ # zsh — write to a directory in $fpath
+ aiex completion zsh > "${fpath[1]}/_aiex"
+ # or use the pre-built file from the package:
+ # $(npm root -g)/aiex-cli/dist/completions/aiex.zsh
+
+ # fish — write to fish completions directory
+ aiex completion fish > ~/.config/fish/completions/aiex.fish
+ ```
 
- > Completions are dynamically generated from the command definitions; no manual updates are needed when commands or options change.
+ > Pre-built completion files are also available in the installed package at `node_modules/aiex-cli/dist/completions/`, so Homebrew formulae, oh-my-zsh plugins, and other package managers can reference them directly without running `aiex completion`.
 
  <br>