@arela/uploader 1.0.1 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,272 @@
+ # Arela Propagate Quick Reference
+
+ ## Command
+
+ ```bash
+ arela propagate [options]
+ ```
+
+ ## Options
+
+ | Option | Default | Description |
+ |--------|---------|-------------|
+ | `--api <target>` | `default` | API target: `default`, `agencia`, or `cliente` |
+ | `-b, --batch-size <size>` | `50` | Pedimentos per batch |
+ | `--show-stats` | `false` | Show detailed statistics |
+
+ ## Prerequisites
+
+ 1. **Run `arela scan` first** - Propagate requires scanned files
+ 2. **Run `arela identify` first** - Propagate needs detected pedimentos
+ 3. **Same configuration** - Use the same environment variables as the scan and identify commands
+
+ ## Required Environment Variables
+
+ ```bash
+ ARELA_COMPANY_SLUG=your_company
+ ARELA_SERVER_ID=server01
+ UPLOAD_BASE_PATH=/path/to/files
+ # Quoted because an unquoted | is a pipe operator in the shell
+ UPLOAD_SOURCES="2023|2024|2025"
+ ```
+
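A minimal sketch of a pre-flight check for the four variables above. The helper `checkPropagateEnv` is hypothetical and not part of the arela CLI, and it assumes `UPLOAD_SOURCES` is a pipe-separated list, as the example value suggests.

```javascript
// Hypothetical helper, not part of the arela CLI.
const REQUIRED_PROPAGATE_VARS = [
  "ARELA_COMPANY_SLUG",
  "ARELA_SERVER_ID",
  "UPLOAD_BASE_PATH",
  "UPLOAD_SOURCES",
];

function checkPropagateEnv(env) {
  // A variable counts as missing when it is unset or blank.
  const missing = REQUIRED_PROPAGATE_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  // Assumption: sources are separated by "|", as in the example value.
  const sources = (env.UPLOAD_SOURCES || "")
    .split("|")
    .map((source) => source.trim())
    .filter((source) => source.length > 0);
  return { ok: missing.length === 0, missing, sources };
}

const envCheck = checkPropagateEnv({
  ARELA_COMPANY_SLUG: "your_company",
  ARELA_SERVER_ID: "server01",
  UPLOAD_BASE_PATH: "/path/to/files",
  UPLOAD_SOURCES: "2023|2024|2025",
});
console.log(envCheck.ok, envCheck.missing, envCheck.sources);
```

Passing the environment in as an argument (for example `process.env`) keeps the check easy to exercise with sample values.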
+ ## Examples
+
+ ```bash
+ # Basic propagation
+ arela propagate
+
+ # Use specific API
+ arela propagate --api agencia
+
+ # Faster for large datasets
+ arela propagate --batch-size 100
+
+ # With detailed stats
+ arela propagate --show-stats
+ ```
+
+ ## What It Does
+
+ 1. Fetches pedimentos that have an `arela_path` from the `file_stats_*` table
+ 2. For each pedimento, finds files in the same directory
+ 3. Propagates `arela_path`, `rfc`, and `year` to related files
+ 4. Tracks attempts and errors for monitoring
+ 5. Writes the results to the database in batches
+
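The steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not the CLI's actual implementation: the field names come from the tables and queries in this document, while the `id` field, the row format, and the "first pedimento per directory wins" rule are assumptions.

```javascript
// Illustrative sketch only; `id` and the in-memory row format are assumptions.
function planPropagation(rows) {
  // Step 1: index pedimentos that already have an arela_path by directory.
  // Assumption: the first pedimento found in a directory is the source.
  const sourceByDirectory = new Map();
  for (const row of rows) {
    const isSource =
      row.detected_type === "pedimento_simplificado" && row.arela_path != null;
    if (isSource && !sourceByDirectory.has(row.directory_path)) {
      sourceByDirectory.set(row.directory_path, row);
    }
  }

  // Steps 2-3: copy the source values to files in the same directory
  // that do not have an arela_path yet.
  const plannedUpdates = [];
  for (const row of rows) {
    const source = sourceByDirectory.get(row.directory_path);
    if (!source || row.arela_path != null) continue;
    plannedUpdates.push({
      id: row.id,
      arela_path: source.arela_path,
      rfc: source.rfc,
      detected_pedimento_year: source.detected_pedimento_year,
      propagated_from_id: source.id,
    });
  }
  return plannedUpdates;
}

// Made-up sample rows for illustration.
const sampleRows = [
  { id: "a1", directory_path: "2024/3019796", detected_type: "pedimento_simplificado",
    arela_path: "RFC/2024/3019796", rfc: "RFC", detected_pedimento_year: 2024 },
  { id: "a2", directory_path: "2024/3019796", detected_type: null, arela_path: null },
  { id: "b1", directory_path: "2024/other", detected_type: null, arela_path: null },
];
console.log(planPropagation(sampleRows));
```

Only `a2` receives an update here: `b1` sits in a directory without a pedimento source, so it is left untouched.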
+ ## Output Example
+
+ ```
+ 🔄 Starting arela propagate command
+
+ 📊 Table: file_stats_acme_corp_nas01_data
+ 🎯 API Target: default
+ 📦 Batch Size: 50
+
+ 📈 Propagation Status:
+ Total Files: 5000
+ With arela_path: 850
+ Pedimento Sources: 850
+ Needs Propagation: 2000
+ Pending: 2000
+ Errors: 0
+
+ 🏷️ Marking files needing propagation...
+ ✓ Marked 2000 files
+
+ 🚀 Processing propagation...
+
+ 📄 Propagating |████████████████████| 100% | 850/850 directories | 145 files/sec | 2000 files updated
+
+ 📊 Results:
+ Pedimentos Processed: 850
+ Directories Processed: 850
+ Files Updated: 2000
+ Errors: 0
+ Duration: 13.8s
+ Speed: 145 files/sec
+
+ ✅ Propagation Complete!
+
+ 📈 Final Status:
+ Total Files: 5000
+ With arela_path: 2850
+ Needs Propagation: 2000
+ Pending: 0
+ Errors: 0
+ ```
+
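The figures in the sample output are consistent with each other, which the short check below illustrates. The helper `filesPerSecond` is hypothetical, and rounding to the nearest integer is an assumption about how the CLI reports speed.

```javascript
// Hypothetical helper; rounding to the nearest integer is an assumption.
function filesPerSecond(filesUpdated, durationSeconds) {
  if (durationSeconds <= 0) return 0; // avoid dividing by zero
  return Math.round(filesUpdated / durationSeconds);
}

// 2000 files in 13.8 s is about 144.9 files/sec, reported as 145.
console.log(filesPerSecond(2000, 13.8)); // 145

// 850 files already had an arela_path; 2000 more were updated.
console.log(850 + 2000); // 2850, the final "With arela_path" count
```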
+ ## Backend Endpoints Used
+
+ ```
+ POST /api/uploader/scan/mark-propagation?tableName=X
+ GET /api/uploader/scan/pedimento-sources?tableName=X&offset=0&limit=50
+ GET /api/uploader/scan/files-by-directory?tableName=X&directoryPath=Y
+ PATCH /api/uploader/scan/batch-update-propagation?tableName=X
+ GET /api/uploader/scan/propagation-stats?tableName=X
+ ```
+
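As a sketch of how the query strings above could be assembled, the snippet below uses the standard `URL` and `URLSearchParams` APIs. The helper `buildScanUrl` and the base URL `https://api.example.com` are placeholders for illustration and are not taken from the CLI.

```javascript
// Hypothetical helper; the base URL is a placeholder, not a real Arela host.
function buildScanUrl(baseUrl, endpoint, params) {
  const url = new URL(`/api/uploader/scan/${endpoint}`, baseUrl);
  for (const [key, value] of Object.entries(params)) {
    // searchParams percent-encodes values, so "/" in a path becomes %2F.
    url.searchParams.set(key, String(value));
  }
  return url.toString();
}

const exampleBase = "https://api.example.com";
const exampleTable = "file_stats_acme_corp_nas01_data";

console.log(
  buildScanUrl(exampleBase, "pedimento-sources", {
    tableName: exampleTable,
    offset: 0,
    limit: 50,
  })
);
console.log(
  buildScanUrl(exampleBase, "files-by-directory", {
    tableName: exampleTable,
    directoryPath: "2024/3019796",
  })
);
```

Letting `searchParams` do the encoding avoids hand-escaping directory paths that contain slashes or spaces.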
+ ## Database Fields Updated
+
+ | Field | Type | Description |
+ |-------|------|-------------|
+ | `arela_path` | TEXT | Upload path (RFC/Year/...) |
+ | `rfc` | VARCHAR | Tax ID from pedimento |
+ | `detected_pedimento_year` | INTEGER | Year from pedimento |
+ | `propagated_from_id` | UUID | Reference to source pedimento |
+ | `propagation_attempted_at` | TIMESTAMP | When propagation ran |
+ | `propagation_attempts` | INTEGER | Number of attempts |
+ | `propagation_error` | TEXT | Error if propagation failed |
+ | `needs_propagation` | BOOLEAN | Flag for efficient querying |
+
+ ## Performance Tips
+
+ - **Large datasets**: Increase the batch size to 100-200
+ - **Slow propagation**: Check whether the files are spread across many small directories rather than a few large ones
+ - **API latency**: Use the `--api` flag to select a closer API
+ - **Memory usage**: Reduce the batch size if memory usage is high
+
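A rough way to reason about the first two tips is to count requests. The sketch below assumes one `pedimento-sources` request per batch and one `files-by-directory` request per directory. That request pattern is inferred from the endpoint list and is not confirmed by the source.

```javascript
// Hypothetical estimate; the request pattern is an assumption.
function estimateRequestCounts(pedimentoCount, directoryCount, batchSize) {
  return {
    // Pages needed to fetch all pedimento sources at this batch size.
    sourcePages: Math.ceil(pedimentoCount / batchSize),
    // Assumed: one lookup per directory, independent of batch size.
    directoryLookups: directoryCount,
  };
}

// Using the 850 pedimentos and 850 directories from the sample output.
for (const batchSize of [50, 100, 200]) {
  console.log(batchSize, estimateRequestCounts(850, 850, batchSize));
}
```

Under these assumptions, raising the batch size from 50 to 200 cuts source pages from 17 to 5 but leaves the 850 directory lookups unchanged, which is why many small directories can dominate the run time.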
+ ## Optimization Features
+
+ ### 1. **Directory-Based Indexing**
+ - Uses exact `directory_path` matching (no regex)
+ - Leverages indexes for fast lookups
+ - Processes one directory at a time
+
+ ### 2. **Attempt Tracking**
+ - Tracks `propagation_attempts` per file
+ - Respects `max_propagation_attempts` (default: 3)
+ - Skips files that reached max attempts
+
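The attempt-tracking rule can be expressed as a small predicate that mirrors the `pending` condition in the monitoring query later in this document. The function `isPendingPropagation` is a hypothetical name, and falling back to 3 when `max_propagation_attempts` is absent is an assumption based on the stated default.

```javascript
// Hypothetical predicate mirroring the "pending" filter of the monitoring query.
const DEFAULT_MAX_PROPAGATION_ATTEMPTS = 3; // stated default

function isPendingPropagation(file) {
  const maxAttempts =
    file.max_propagation_attempts ?? DEFAULT_MAX_PROPAGATION_ATTEMPTS;
  const attempts = file.propagation_attempts ?? 0;
  return (
    file.arela_path == null &&
    file.needs_propagation === true &&
    attempts < maxAttempts
  );
}

// Made-up sample rows.
console.log(isPendingPropagation({ arela_path: null, needs_propagation: true, propagation_attempts: 2 })); // true
console.log(isPendingPropagation({ arela_path: null, needs_propagation: true, propagation_attempts: 3 })); // false
```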
+ ### 3. **Preparation Phase**
+ - Marks files needing propagation before processing
+ - Enables efficient batch queries
+ - Reduces unnecessary processing
+
+ ### 4. **Batch Processing**
+ - Fetches pedimentos in configurable batches
+ - Updates multiple files per API call
+ - Real-time progress with throughput metrics
+
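Sending several file updates per API call amounts to splitting a list into fixed-size groups. The sketch below shows one generic way to do that. `chunkItems` is a hypothetical helper, and how the CLI actually groups updates is not shown in this document.

```javascript
// Generic chunking helper; not taken from the CLI source.
function chunkItems(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// 120 made-up updates split with the default batch size of 50.
const sampleUpdates = Array.from({ length: 120 }, (_, index) => ({ id: index }));
console.log(chunkItems(sampleUpdates, 50).map((batch) => batch.length)); // [ 50, 50, 20 ]
```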
+ ## Troubleshooting
+
+ | Error | Solution |
+ |-------|----------|
+ | "Scan configuration errors" | Set `ARELA_COMPANY_SLUG` and `ARELA_SERVER_ID` |
+ | "Table not found" | Run `arela scan` first |
+ | "No files need propagation" | Run `arela identify` first |
+ | "Failed to fetch pedimento sources" | Check API connectivity |
+
+ ## Comparison: Legacy vs Optimized
+
+ | Feature | Legacy (`detect --propagate-arela-path`) | New (`propagate`) |
+ |---------|----------------------------------------|-----------------|
+ | Table | Global `uploader` | Dynamic `file_stats_*` |
+ | API | Supabase direct | Configured API |
+ | Query Strategy | Single UPDATE with regex | Directory-based exact match |
+ | Progress | Batch count | Real-time throughput |
+ | Indexes | Regex-based | Directory + arela_path |
+ | Attempt Tracking | No | Yes (max 3 attempts) |
+ | Error Handling | Basic | Categorized with tracking |
+
+ ## Next Steps
+
+ After propagation:
+
+ ```bash
+ # Upload files by RFC
+ arela push
+ ```
+
+ ## Monitoring Queries
+
+ ```sql
+ -- Check propagation progress
+ SELECT
+   COUNT(*) FILTER (WHERE arela_path IS NOT NULL) as with_arela_path,
+   COUNT(*) FILTER (WHERE needs_propagation = TRUE) as needs_propagation,
+   COUNT(*) FILTER (
+     WHERE arela_path IS NULL
+       AND needs_propagation = TRUE
+       AND propagation_attempts < max_propagation_attempts
+   ) as pending
+ FROM cli.file_stats_<company>_<server>_<path>;
+
+ -- View propagation by directory
+ SELECT
+   directory_path,
+   COUNT(*) as total_files,
+   COUNT(*) FILTER (WHERE arela_path IS NOT NULL) as propagated,
+   MAX(propagation_attempted_at) as last_attempt
+ FROM cli.file_stats_<company>_<server>_<path>
+ WHERE needs_propagation = TRUE
+ GROUP BY directory_path
+ ORDER BY total_files DESC
+ LIMIT 20;
+
+ -- Check propagation errors
+ SELECT
+   directory_path,
+   file_name,
+   propagation_error,
+   propagation_attempts
+ FROM cli.file_stats_<company>_<server>_<path>
+ WHERE propagation_error IS NOT NULL
+ ORDER BY propagation_attempts DESC
+ LIMIT 20;
+
+ -- Find directories with pedimentos but unpropagated files
+ SELECT
+   f.directory_path,
+   COUNT(*) as unpropagated_files,
+   MAX(p.arela_path) as pedimento_arela_path
+ FROM cli.file_stats_<company>_<server>_<path> f
+ INNER JOIN cli.file_stats_<company>_<server>_<path> p
+   ON f.directory_path = p.directory_path
+   AND p.detected_type = 'pedimento_simplificado'
+   AND p.arela_path IS NOT NULL
+ WHERE f.arela_path IS NULL
+   AND (f.detected_type IS NULL OR f.detected_type != 'pedimento_simplificado')
+ GROUP BY f.directory_path
+ ORDER BY unpropagated_files DESC
+ LIMIT 20;
+ ```
+
+ ## Reset Propagation
+
+ ```sql
+ -- Reset all propagation for retry
+ UPDATE cli.file_stats_<company>_<server>_<path>
+ SET
+   propagation_attempts = 0,
+   propagation_error = NULL,
+   propagation_attempted_at = NULL,
+   needs_propagation = FALSE
+ WHERE arela_path IS NULL;
+
+ -- Reset specific directory
+ UPDATE cli.file_stats_<company>_<server>_<path>
+ SET
+   propagation_attempts = 0,
+   propagation_error = NULL,
+   propagation_attempted_at = NULL
+ WHERE directory_path LIKE '2024/3019796/%'
+   AND arela_path IS NULL;
+
+ -- Increase max attempts for stubborn files
+ UPDATE cli.file_stats_<company>_<server>_<path>
+ SET max_propagation_attempts = 5
+ WHERE propagation_attempts >= max_propagation_attempts
+   AND arela_path IS NULL;
+ ```
+
+ ## Files Involved
+
+ ### CLI
+ - `src/commands/PropagateCommand.js` - Main command
+ - `src/services/ScanApiService.js` - API communication
+ - `src/config/config.js` - Configuration (reused)
+
+ ### Backend
+ - `src/uploader/services/file-stats-table-manager.service.ts` - Table operations
+ - `src/uploader/services/uploader.service.ts` - Business logic
+ - `src/uploader/controllers/uploader.controller.ts` - REST endpoints