@arela/uploader 1.0.24 → 1.1.1
- package/docs/AUTO_PROCESSING_PIPELINE.md +258 -0
- package/docs/COMPLETE_USAGE_GUIDE.md +1363 -0
- package/docs/DATABASESERVICE_IMPROVEMENTS.md +546 -0
- package/docs/PASO_2_TEST_RESULTS.md +298 -0
- package/docs/PASO_3_PLAN.md +385 -0
- package/docs/PHASE_1_FILE_DETECTION.md +366 -0
- package/docs/PHASE_2_API_INTEGRATION.md +426 -0
- package/docs/PHASE_3_DATABASE_MANAGEMENT.md +480 -0
- package/docs/PHASE_4_FILE_OPERATIONS.md +448 -0
- package/docs/PHASE_5_WATCH_MODE.md +450 -0
- package/docs/PHASE_6_SIGNAL_HANDLING.md +472 -0
- package/docs/PHASE_7_ADVANCED_FEATURES.md +560 -0
- package/docs/PLAN_WATCH_FEATURE.md +417 -0
- package/docs/README.md +480 -0
- package/docs/SCHEMA_ALIGNMENT_SUMMARY.md +301 -0
- package/docs/SMARTWATCH_DATABASE_REFACTORING.md +181 -0
- package/docs/SMART_WATCH_DATABASE_CHANGES.md +502 -0
- package/docs/TESTING_WATCH_MODE.md +212 -0
- package/docs/WATCHER_API_IMPLEMENTATION.md +520 -0
- package/docs/WATCHER_API_INTEGRATION.md +562 -0
- package/docs/WATCHER_SETUP_GUIDE.md +614 -0
- package/docs/WATCH_ARCHITECTURE.md +395 -0
- package/docs/WATCH_AUTO_PIPELINE.md +334 -0
- package/docs/WATCH_CONFIGURATION.md +267 -0
- package/docs/WATCH_USAGE_GUIDE.md +567 -0
- package/docs/commands.md +14 -0
- package/package.json +1 -1
- package/src/commands/IdentifyCommand.js +11 -0
- package/src/config/config.js +2 -2
- package/src/file-detection.js +42 -1
- package/src/scoring/scoring-engine.js +40 -7
- package/src/services/LoggingService.js +5 -3
- package/.vscode/settings.json +0 -1
- package/coverage/IdentifyCommand.js.html +0 -1462
- package/coverage/PropagateCommand.js.html +0 -1507
- package/coverage/PushCommand.js.html +0 -1504
- package/coverage/ScanCommand.js.html +0 -1654
- package/coverage/UploadCommand.js.html +0 -1846
- package/coverage/WatchCommand.js.html +0 -4111
- package/coverage/base.css +0 -224
- package/coverage/block-navigation.js +0 -87
- package/coverage/favicon.png +0 -0
- package/coverage/index.html +0 -191
- package/coverage/lcov-report/IdentifyCommand.js.html +0 -1462
- package/coverage/lcov-report/PropagateCommand.js.html +0 -1507
- package/coverage/lcov-report/PushCommand.js.html +0 -1504
- package/coverage/lcov-report/ScanCommand.js.html +0 -1654
- package/coverage/lcov-report/UploadCommand.js.html +0 -1846
- package/coverage/lcov-report/WatchCommand.js.html +0 -4111
- package/coverage/lcov-report/base.css +0 -224
- package/coverage/lcov-report/block-navigation.js +0 -87
- package/coverage/lcov-report/favicon.png +0 -0
- package/coverage/lcov-report/index.html +0 -191
- package/coverage/lcov-report/prettify.css +0 -1
- package/coverage/lcov-report/prettify.js +0 -2
- package/coverage/lcov-report/sort-arrow-sprite.png +0 -0
- package/coverage/lcov-report/sorter.js +0 -210
- package/coverage/lcov.info +0 -1937
- package/coverage/prettify.css +0 -1
- package/coverage/prettify.js +0 -2
- package/coverage/sort-arrow-sprite.png +0 -0
- package/coverage/sorter.js +0 -210
- package/docs/API_ENDPOINTS_FOR_DETECTION.md +0 -647
- package/docs/API_RETRY_MECHANISM.md +0 -338
- package/docs/ARELA_IDENTIFY_IMPLEMENTATION.md +0 -489
- package/docs/ARELA_IDENTIFY_QUICKREF.md +0 -186
- package/docs/ARELA_PROPAGATE_IMPLEMENTATION.md +0 -581
- package/docs/ARELA_PROPAGATE_QUICKREF.md +0 -272
- package/docs/ARELA_PUSH_IMPLEMENTATION.md +0 -577
- package/docs/ARELA_PUSH_QUICKREF.md +0 -322
- package/docs/ARELA_SCAN_IMPLEMENTATION.md +0 -373
- package/docs/ARELA_SCAN_QUICKREF.md +0 -139
- package/docs/CROSS_PLATFORM_PATH_HANDLING.md +0 -597
- package/docs/DETECTION_ATTEMPT_TRACKING.md +0 -414
- package/docs/MIGRATION_UPLOADER_TO_FILE_STATS.md +0 -1020
- package/docs/MULTI_LEVEL_DIRECTORY_SCANNING.md +0 -494
- package/docs/QUICK_REFERENCE_API_DETECTION.md +0 -264
- package/docs/REFACTORING_SUMMARY_DETECT_PEDIMENTOS.md +0 -200
- package/docs/STATS_COMMAND_SEQUENCE_DIAGRAM.md +0 -287
- package/docs/STATS_COMMAND_SIMPLE.md +0 -93
@@ -0,0 +1,258 @@
# Automatic 4-Step Processing Pipeline

## Overview

The automatic processing pipeline enables Watch Mode to automatically execute a 4-step workflow whenever a new file is detected in monitored directories. This streamlines the file upload process without manual intervention.

## The 4-Step Pipeline

When auto-processing is enabled and a new file is detected, the pipeline automatically executes:

### Step 1: Stats Collection (`stats --stats-only`)
- Collects file statistics and metadata
- Uploads file information to the database (uploads table)
- Records file properties: size, type, modification date, path

### Step 2: PDF/Pedimento Detection (`detect --detect-pdfs`)
- Analyzes files to detect "Pedimento Simplificado" documents
- Identifies which files are simplified customs documents
- Updates database records with detection results
- **Important**: Files in the same directory will only be uploaded if a Pedimento Simplificado is detected

### Step 3: Arela Path Propagation (`detect --propagate-arela-path`)
- Propagates `arela_path` from detected Pedimento records to related files
- Ensures all documents in the same directory are correctly linked
- Creates the hierarchical relationship between main and supporting documents

### Step 4: RFC-Based Upload (`upload --upload-by-rfc --folder-structure`)
- Uploads files based on RFC values from `UPLOAD_RFCS` configuration
- Uses the configured folder structure for bucket organization
- Each watched directory can have its own folder structure
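
The four steps above run one after another. A minimal sketch of that sequencing follows, assuming each step is an async function and that the run stops at the first failing step (this document does not say how failures are handled). `runPipeline` and the placeholder steps are hypothetical names, not the package's `AutoProcessingService` API.

```javascript
// Hypothetical sketch: run named async steps in order and stop at the first failure.
// Stop-on-failure is an assumption; the package may handle errors differently.
async function runPipeline(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.run(context);
      completed.push(step.name);
    } catch (error) {
      return { success: false, failedStep: step.name, completed, error: error.message };
    }
  }
  return { success: true, failedStep: null, completed };
}

// Placeholder steps that only record their names. Real steps would invoke the
// stats, detection, propagation, and upload logic described above.
const exampleSteps = ['stats', 'detect-pdfs', 'propagate-arela-path', 'upload-by-rfc'].map(
  (name) => ({ name, run: async (context) => { context.log.push(name); } })
);

runPipeline(exampleSteps, { log: [] }).then((result) => {
  console.log(result.success, result.completed.join(' -> '));
});
```

Returning a result object instead of throwing keeps the caller (for example a file watcher) running when a single file fails.
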

## Configuration

### 1. Environment Setup (`.env`)

Configure watched directories with associated folder structures:

```env
# JSON format: {"directory_path": "folder_structure"}
WATCH_DIRECTORY_CONFIGS={"../../Documents/2022":"estructura-2022","../../Documents/2023":"estructura-2023"}
```

Each directory can have a unique folder structure:
- `../../Documents/2022` → uploads to bucket structure `estructura-2022`
- `../../Documents/2023` → uploads to bucket structure `estructura-2023`
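
Because `WATCH_DIRECTORY_CONFIGS` is a JSON object stored in an environment variable, a malformed value is an easy mistake to make. The sketch below shows one way such a value could be parsed and validated. `parseWatchDirectoryConfigs` is a hypothetical helper written for illustration; the package's own parser in `src/config/config.js` is not shown here and may behave differently.

```javascript
// Hypothetical helper: turn the raw WATCH_DIRECTORY_CONFIGS string into a list of
// { directory, folderStructure } pairs, failing loudly on malformed input.
function parseWatchDirectoryConfigs(raw) {
  if (typeof raw !== 'string' || raw.trim() === '') {
    return []; // nothing configured
  }
  let parsed;
  try {
    parsed = JSON.parse(raw);
  } catch (error) {
    throw new Error(`WATCH_DIRECTORY_CONFIGS is not valid JSON: ${error.message}`);
  }
  if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
    throw new Error('WATCH_DIRECTORY_CONFIGS must be a JSON object');
  }
  return Object.entries(parsed).map(([directory, folderStructure]) => {
    if (typeof folderStructure !== 'string' || folderStructure === '') {
      throw new Error(`Folder structure for "${directory}" must be a non-empty string`);
    }
    return { directory, folderStructure };
  });
}

const exampleConfigs = parseWatchDirectoryConfigs(
  '{"../../Documents/2022":"estructura-2022","../../Documents/2023":"estructura-2023"}'
);
console.log(exampleConfigs);
```
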

### 2. CLI Options

Enable auto-processing when running watch mode:

```bash
# Basic usage with auto-processing enabled
arela watch --auto-processing

# With custom directories
arela watch -d "../../Documents/2022,../../Documents/2023" --auto-processing

# With other options
arela watch --auto-processing -b 10 --debounce 1000 -s batch
```

### 3. Processing Options

When auto-processing is enabled, you can configure:

| Option | Default | Description |
|--------|---------|-------------|
| `--batch-size` / `-b` | 10 | Files to process per batch in stats collection |
| `--debounce` | 1000ms | Wait time before processing after file event |
| `--auto-processing` | - | Enable the 4-step pipeline |
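
The `--debounce` option delays processing until file events settle. A rough sketch of the idea follows, assuming events are debounced per file path (this document does not say whether the package debounces per file or globally). `createDebouncer` is a hypothetical name.

```javascript
// Hypothetical sketch: call onReady(filePath) only after delayMs has passed
// without another event for the same path. Per-path debouncing is an assumption.
function createDebouncer(delayMs, onReady) {
  const timers = new Map();
  return function onFileEvent(filePath) {
    // Restart the countdown if this path already has a pending timer.
    clearTimeout(timers.get(filePath));
    timers.set(
      filePath,
      setTimeout(() => {
        timers.delete(filePath);
        onReady(filePath);
      }, delayMs)
    );
  };
}

// Three rapid events for the same file result in a single callback.
const onFileEvent = createDebouncer(50, (filePath) => console.log('process', filePath));
onFileEvent('/watched/invoice.pdf');
onFileEvent('/watched/invoice.pdf');
onFileEvent('/watched/invoice.pdf');
```

A longer delay trades latency for fewer redundant runs, which matters when large files are written in several chunks.
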

## Workflow Example

### Scenario: New file detected in monitored directory

```
📄 New file: /Documents/2023/AKS151005E46/invoice.pdf
↓
⚡ File Event Detected (add)
↓
┌─────────────────────────────────────────┐
│ Step 1: Stats Collection                │
│   → Collects file info                  │
│   → Updates uploads table               │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Step 2: PDF Detection                   │
│   → Searches for Pedimento Simplificado │
│   → Updates detection status            │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Step 3: Arela Path Propagation          │
│   → Links related documents             │
│   → Propagates metadata                 │
└─────────────────────────────────────────┘
↓
┌─────────────────────────────────────────┐
│ Step 4: RFC Upload                      │
│   → Identifies RFC from directory/file  │
│   → Uploads with folder structure       │
│   → Creates bucket organization         │
└─────────────────────────────────────────┘
↓
✅ Processing Complete
```

## Important Constraints

### Pedimento Detection Requirement
- If a Pedimento Simplificado is NOT detected in Step 2, documents will not be uploaded
- Ensure your documents include the proper Pedimento Simplificado format
- Check database records for detection status:
  ```bash
  arela query --ready-files
  ```
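
The detection requirement can be read as a per-directory gate: a directory's files become uploadable only once one of them is detected as a Pedimento Simplificado. The sketch below illustrates that rule on in-memory records. The field names `directory` and `isPedimento` and the function `selectUploadableFiles` are hypothetical; the real uploads table schema and query are not shown here.

```javascript
// Hypothetical sketch: keep only files whose directory contains at least one
// record flagged as a Pedimento Simplificado.
function selectUploadableFiles(records) {
  const directoriesWithPedimento = new Set(
    records.filter((record) => record.isPedimento).map((record) => record.directory)
  );
  return records.filter((record) => directoriesWithPedimento.has(record.directory));
}

const exampleRecords = [
  { path: '/docs/A/pedimento.pdf', directory: '/docs/A', isPedimento: true },
  { path: '/docs/A/invoice.pdf', directory: '/docs/A', isPedimento: false },
  { path: '/docs/B/invoice.pdf', directory: '/docs/B', isPedimento: false },
];

// Only the two files under /docs/A pass the gate.
console.log(selectUploadableFiles(exampleRecords).map((record) => record.path));
```
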

### RFC Configuration
- Ensure `UPLOAD_RFCS` environment variable contains required RFCs
- Files from unregistered RFCs will not be processed
- Example configuration:
  ```env
  UPLOAD_RFCS=AKS151005E46|IMS030409FZ0|RDG1107154L7
  ```
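
`UPLOAD_RFCS` is a pipe-separated list, as the example above shows. The following sketch splits such a value and checks whether a file path contains a registered RFC as one of its path segments. Matching on path segments is an assumption made for illustration; the package may resolve RFCs differently (for example from database records). `parseUploadRfcs` and `findRfcInPath` are hypothetical names.

```javascript
// Hypothetical helper: split the pipe-separated UPLOAD_RFCS value into a clean list.
function parseUploadRfcs(raw) {
  return (raw || '')
    .split('|')
    .map((rfc) => rfc.trim().toUpperCase())
    .filter(Boolean);
}

// Hypothetical helper: return the first registered RFC that appears as a path
// segment (handles both "/" and "\" separators), or null when none matches.
function findRfcInPath(filePath, rfcs) {
  const segments = filePath.split(/[\\/]+/).map((segment) => segment.toUpperCase());
  const match = rfcs.find((rfc) => segments.includes(rfc));
  return match === undefined ? null : match;
}

const exampleRfcs = parseUploadRfcs('AKS151005E46|IMS030409FZ0|RDG1107154L7');
console.log(findRfcInPath('/Documents/2023/AKS151005E46/invoice.pdf', exampleRfcs));
console.log(findRfcInPath('/Documents/2023/UNKNOWN/invoice.pdf', exampleRfcs));
```
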

## Monitoring and Debugging

### Check Auto-Processing Status

The watch command logs auto-processing events:

```
[AutoPipeline] Triggering 4-step processing pipeline for: /path/to/file.pdf
[AutoPipeline] Step 1/4: Stats collection...
[AutoPipeline] ✅ Stats collection completed
[AutoPipeline] Step 2/4: PDF detection...
[AutoPipeline] ✅ PDF detection completed
...
[AutoPipeline] ✅ Pipeline completed successfully (ID: pipeline-xxx-yyy)
```

### Query Ready Files

Check which files are prepared for upload:

```bash
arela query --ready-files
```

### Manual Commands (if needed)

Run individual steps manually:

```bash
# Stats only
arela stats --stats-only

# PDF Detection
arela detect --detect-pdfs

# Path Propagation
arela detect --propagate-arela-path

# RFC Upload
arela upload --upload-by-rfc --folder-structure estructura-2023
```

## Performance Considerations

- **Debounce Time**: Default 1000ms prevents redundant processing
- **Batch Size**: Affects concurrent processing in Step 1
- **Pipeline Concurrency**: Only one pipeline runs at a time (prevents conflicts)
- **Auto-Processing Overhead**: Each new file triggers all 4 steps (approximately 10-30 seconds depending on file count)
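
The one-pipeline-at-a-time rule can be enforced by chaining runs on a single promise. The sketch below queues later runs behind the current one; whether the package queues, skips, or merges overlapping triggers is not stated here, so queuing is an assumption. `createSerialQueue` is a hypothetical name.

```javascript
// Hypothetical sketch: a serial queue that starts each task only after the
// previous one has settled, so at most one pipeline runs at any moment.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(task) {
    const result = tail.then(() => task());
    // Swallow errors on the chain itself so one failed run does not block later ones;
    // the caller still sees the rejection through the returned promise.
    tail = result.catch(() => {});
    return result;
  };
}

const enqueuePipeline = createSerialQueue();
const pause = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

enqueuePipeline(async () => { console.log('run 1 start'); await pause(30); console.log('run 1 end'); });
enqueuePipeline(async () => { console.log('run 2 start'); console.log('run 2 end'); });
```

With queuing, a burst of new files costs roughly the sum of the individual run times, which is one reason a longer debounce can help.
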

## Disabling Auto-Processing

To use watch mode without auto-processing:

```bash
# Watch mode without auto-processing
arela watch

# Files detected but not automatically processed
# Use manual commands instead:
arela stats --stats-only
arela detect --detect-pdfs
arela detect --propagate-arela-path
arela upload --upload-by-rfc
```

## Troubleshooting

### Pipeline Not Triggering
1. Check if `--auto-processing` flag is set
2. Verify `WATCH_DIRECTORY_CONFIGS` is properly configured in `.env`
3. Check logs for configuration errors: `--verbose` flag

### Files Not Uploading (Step 4 Failing)
1. Verify Pedimento was detected: `arela query --ready-files`
2. Check `UPLOAD_RFCS` configuration
3. Verify folder structure is valid

### Performance Issues
- Increase debounce time: `--debounce 2000`
- Reduce batch size: `-b 5`
- Check system resources during processing

## Files Modified

- **`.env`**: New `WATCH_DIRECTORY_CONFIGS` format
- **`src/config/config.js`**: Parser for JSON directory configuration
- **`src/services/AutoProcessingService.js`**: New service for 4-step pipeline
- **`src/services/WatchService.js`**: Integration of auto-processing
- **`src/commands/WatchCommand.js`**: Support for directory configurations
- **`src/index.js`**: New `--auto-processing` CLI option
- **`src/utils/WatchEventHandler.js`**: Pipeline invocation support

## API Reference

### WatchService Methods

```javascript
// Enable automatic processing
watchService.enableAutoProcessing({ batchSize: 10 });

// Disable automatic processing
watchService.disableAutoProcessing();

// Check if enabled
const enabled = watchService.isAutoProcessingEnabled();

// Get stats including pipeline count
const stats = watchService.getStats();
// Output: { pipelinesTriggered: 5, ... }
```

### AutoProcessingService Methods

```javascript
// Execute the 4-step pipeline
const result = await autoProcessingService.executeProcessingPipeline({
  filePath: '/path/to/file.pdf',
  watchDir: '/watched/directory',
  folderStructure: 'estructura-2023',
  batchSize: 10
});

// Returns:
// {
//   pipelineId: 'pipeline-xxx-yyy',
//   summary: {
//     success: true,
//     message: '✅ All 4 steps completed successfully!',
//     details: { ... }
//   }
// }
```