@arela/uploader 1.0.2 → 1.0.4
- package/.env.local +316 -0
- package/.env.template +70 -0
- package/coverage/IdentifyCommand.js.html +1462 -0
- package/coverage/PropagateCommand.js.html +1507 -0
- package/coverage/PushCommand.js.html +1504 -0
- package/coverage/ScanCommand.js.html +1654 -0
- package/coverage/UploadCommand.js.html +1846 -0
- package/coverage/WatchCommand.js.html +4111 -0
- package/coverage/base.css +224 -0
- package/coverage/block-navigation.js +87 -0
- package/coverage/favicon.png +0 -0
- package/coverage/index.html +191 -0
- package/coverage/lcov-report/IdentifyCommand.js.html +1462 -0
- package/coverage/lcov-report/PropagateCommand.js.html +1507 -0
- package/coverage/lcov-report/PushCommand.js.html +1504 -0
- package/coverage/lcov-report/ScanCommand.js.html +1654 -0
- package/coverage/lcov-report/UploadCommand.js.html +1846 -0
- package/coverage/lcov-report/WatchCommand.js.html +4111 -0
- package/coverage/lcov-report/base.css +224 -0
- package/coverage/lcov-report/block-navigation.js +87 -0
- package/coverage/lcov-report/favicon.png +0 -0
- package/coverage/lcov-report/index.html +191 -0
- package/coverage/lcov-report/prettify.css +1 -0
- package/coverage/lcov-report/prettify.js +2 -0
- package/coverage/lcov-report/sort-arrow-sprite.png +0 -0
- package/coverage/lcov-report/sorter.js +210 -0
- package/coverage/lcov.info +1937 -0
- package/coverage/prettify.css +1 -0
- package/coverage/prettify.js +2 -0
- package/coverage/sort-arrow-sprite.png +0 -0
- package/coverage/sorter.js +210 -0
- package/docs/API_RETRY_MECHANISM.md +338 -0
- package/docs/ARELA_IDENTIFY_IMPLEMENTATION.md +489 -0
- package/docs/ARELA_IDENTIFY_QUICKREF.md +186 -0
- package/docs/ARELA_PROPAGATE_IMPLEMENTATION.md +581 -0
- package/docs/ARELA_PROPAGATE_QUICKREF.md +272 -0
- package/docs/ARELA_PUSH_IMPLEMENTATION.md +577 -0
- package/docs/ARELA_PUSH_QUICKREF.md +322 -0
- package/docs/ARELA_SCAN_IMPLEMENTATION.md +373 -0
- package/docs/ARELA_SCAN_QUICKREF.md +139 -0
- package/docs/CROSS_PLATFORM_PATH_HANDLING.md +593 -0
- package/docs/DETECTION_ATTEMPT_TRACKING.md +414 -0
- package/docs/MIGRATION_UPLOADER_TO_FILE_STATS.md +1020 -0
- package/docs/MULTI_LEVEL_DIRECTORY_SCANNING.md +494 -0
- package/docs/STATS_COMMAND_SEQUENCE_DIAGRAM.md +287 -0
- package/docs/STATS_COMMAND_SIMPLE.md +93 -0
- package/package.json +31 -3
- package/src/commands/IdentifyCommand.js +459 -0
- package/src/commands/PropagateCommand.js +474 -0
- package/src/commands/PushCommand.js +473 -0
- package/src/commands/ScanCommand.js +523 -0
- package/src/config/config.js +154 -7
- package/src/file-detection.js +9 -10
- package/src/index.js +150 -0
- package/src/services/ScanApiService.js +645 -0
- package/src/utils/PathNormalizer.js +220 -0
- package/tests/commands/IdentifyCommand.test.js +570 -0
- package/tests/commands/PropagateCommand.test.js +568 -0
- package/tests/commands/PushCommand.test.js +754 -0
- package/tests/commands/ScanCommand.test.js +382 -0
- package/tests/unit/PathAndTableNameGeneration.test.js +1211 -0
@@ -0,0 +1,593 @@
# Absolute Path Strategy & SCAN_DIRECTORY_LEVEL Implementation

**Updated**: January 19, 2026

## Overview

The system now uses **absolute paths throughout** for maximum clarity and correctness. This ensures that different relative paths like `../sample` and `./sample` create different tables as expected.

## ✅ Simplified Strategy: Use Absolute Paths Everywhere

### Key Principle
**Always resolve paths to absolute paths first, then sanitize for table names.**

### Why This Works Better

1. **Unambiguous**: `../sample` and `./sample` resolve to different absolute paths
2. **Simpler**: Same absolute path used for both `base_path_label` and `base_path_full`
3. **Cross-platform**: Forward slashes in stored paths for consistency
4. **PostgreSQL-friendly**: Special characters sanitized only in table names

### Path Examples

```bash
# From /data/project:
../sample → /data/sample → scan_palco_local_data_sample
./sample → /data/project/sample → scan_palco_local_data_project_sample

# Windows:
C:\Users\Docs → C:/Users/Docs (stored) → scan_palco_local_c_users_docs

# Linux:
/home/user → /home/user (stored) → scan_palco_local_home_user
```
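The resolve-first rule can be illustrated with Node's built-in `path` module. This is a minimal sketch, not the package's `PathNormalizer`: the helper name `toStoredPath` is hypothetical, and it assumes the drive letter is kept in the stored path, as in the Windows example above.

```javascript
const path = require('node:path');

// Hypothetical helper: resolve `input` against `cwd`, then store the result
// with forward slashes. `flavor` selects POSIX or Windows resolution rules so
// the example behaves the same on any host OS.
function toStoredPath(cwd, input, flavor = path.posix) {
  const absolute = flavor.resolve(cwd, input);
  return absolute.replace(/\\/g, '/');
}

console.log(toStoredPath('/data/project', '../sample')); // /data/sample
console.log(toStoredPath('/data/project', './sample')); // /data/project/sample
console.log(toStoredPath('C:\\Users', 'Docs', path.win32)); // C:/Users/Docs
```

Because both relative inputs are resolved before anything else happens, they can no longer collapse into the same label.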

## Storage Details

### cli_registry Table
- `company_slug`: Company identifier
- `server_id`: Server identifier
- `base_path_full`: Absolute path with forward slashes (`C:/Users/Documents/2023`)
- `table_name`: Sanitized version (`scan_palco_local_c_users_documents_2023`)
- **Removed**: `base_path_label` (was redundant - same as base_path_full)

### File Records (scan_* tables)
- `directory_path`: Absolute path with forward slashes
- `relative_path`: Relative to base with forward slashes
- `absolute_path`: Absolute path with forward slashes

**Note**: All paths are stored with forward slashes for consistency; they still work on all platforms.

## Key Features

### 1. Cross-Platform Path Normalization

All paths are normalized to POSIX format (forward slashes) for consistency:

- **Windows paths**: `C:\Users\Documents` → `/Users/Documents`
- **Unix paths**: `/home/user/docs` → `/home/user/docs`
- **Relative paths**: `../parent/folder` → `../parent/folder`
- **Network drives**: `O:/data/files` → `/data/files`

### 2. Full Path in Table Names

Table names now include the complete normalized path:

```
Format: scan_<company>_<server>_<normalized_path>

Examples:
- scan_palco_local_data_2023
- scan_palco_local_users_documents_projects
- scan_company_server_c_users_admin_files
```
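A sketch of how a name in this format could be assembled is shown here. The sanitizing rule (lowercase, runs of non-alphanumeric characters collapsed to one underscore) is an assumption inferred from the examples, and `sanitizeSegment` and `buildTableName` are hypothetical names, not the package's API.

```javascript
// Hypothetical sanitizer: lowercase, turn every run of characters outside
// [a-z0-9] into one underscore, and trim underscores at both ends.
function sanitizeSegment(value) {
  return value
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')
    .replace(/^_+|_+$/g, '');
}

// Assemble scan_<company>_<server>_<normalized_path>.
function buildTableName({ companySlug, serverId, basePath }) {
  const parts = [companySlug, serverId, basePath].map(sanitizeSegment);
  return ['scan', ...parts.filter(Boolean)].join('_');
}

console.log(
  buildTableName({ companySlug: 'palco', serverId: 'local', basePath: '/data/2023' }),
); // scan_palco_local_data_2023
```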

### 3. SCAN_DIRECTORY_LEVEL Support

The `SCAN_DIRECTORY_LEVEL` environment variable controls table granularity:

- **Level 0** (default): Single table for entire base path
- **Level 1**: One table per first-level subdirectory
- **Level 2**: One table per second-level subdirectory
- **Level N**: One table per N-th level subdirectory

Example with `SCAN_DIRECTORY_LEVEL=1` and base path `/data`:
```
/data/2023 → scan_palco_local_data_2023
/data/2024 → scan_palco_local_data_2024
/data/archive → scan_palco_local_data_archive
```
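Level-based discovery can be sketched as a depth-limited directory walk. The function below is an illustration under assumptions, not the package's `#getDirectoriesAtLevel`: the name `directoriesAtLevel` is hypothetical, it reads synchronously for brevity, and it returns labels relative to the base path with forward slashes.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Return the relative labels of all directories exactly `level` steps below
// `basePath`. Level 0 yields a single empty label that stands for the base
// path itself. Plain files are ignored.
function directoriesAtLevel(basePath, level, prefix = '') {
  if (level === 0) return [prefix];
  const current = path.join(basePath, prefix);
  const found = [];
  for (const entry of fs.readdirSync(current, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const label = prefix ? `${prefix}/${entry.name}` : entry.name;
    found.push(...directoriesAtLevel(basePath, level - 1, label));
  }
  return found.sort();
}
```

For a base path containing `2023`, `2024`, and `archive`, calling `directoriesAtLevel(basePath, 1)` returns those three labels, each of which would then map to its own table.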

## Implementation Details

### CLI Components

#### 1. PathNormalizer Utility (`src/utils/PathNormalizer.js`)

Centralized utility for all path operations:

```javascript
// Normalize any path to POSIX format
const normalized = PathNormalizer.normalizePath('C:\\Users\\Documents');
// Result: /Users/Documents

// Generate table name
const tableName = PathNormalizer.generateTableName({
  companySlug: 'palco',
  serverId: 'local',
  basePathLabel: '/data/2023'
});
// Result: scan_palco_local_data_2023

// Build full path label
const fullPath = PathNormalizer.buildBasePathLabel('/data', '2023/subfolder');
// Result: /data/2023/subfolder
```
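For readers without the source at hand, the label-building step can be approximated with `path.posix.join`. This is a hypothetical stand-in (`joinBasePathLabel`), written under the assumption that the real `buildBasePathLabel` simply joins the two parts with forward slashes; the actual implementation may differ.

```javascript
const path = require('node:path');

// Hypothetical stand-in for PathNormalizer.buildBasePathLabel: join a base
// path and a relative label using forward slashes only.
function joinBasePathLabel(basePath, relativeLabel = '') {
  const toForward = (p) => p.replace(/\\/g, '/');
  return path.posix.join(toForward(basePath), toForward(relativeLabel));
}

console.log(joinBasePathLabel('/data', '2023/subfolder')); // /data/2023/subfolder
console.log(joinBasePathLabel('/data', '')); // /data
```

An empty label returns the base path unchanged, which matches the level 0 case where the whole base path maps to one table.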

#### 2. ScanCommand Updates

**Directory Discovery** ([ScanCommand.js](../src/commands/ScanCommand.js)):

```javascript
async #discoverDirectories(basePath, level) {
  // Level 0: Single entry
  if (level === 0) {
    return sources.map((source) => {
      const sourcePath = source === '.' ? basePath : path.resolve(basePath, source);
      const relativePath = source === '.' ? '' : PathNormalizer.getRelativePath(sourcePath, basePath);
      return { path: sourcePath, label: relativePath };
    });
  }

  // Level > 0: Discover directories at specified depth
  const levelDirs = await this.#getDirectoriesAtLevel(basePath, level, '');
  // ... combine with sources
}
```

**Table Registration**:

```javascript
for (const dir of directories) {
  // Build full normalized path label
  const fullPathLabel = PathNormalizer.buildBasePathLabel(
    basePath,
    dir.label || '',
  );

  const registration = await this.scanApiService.registerInstance({
    companySlug: scanConfig.companySlug,
    serverId: scanConfig.serverId,
    basePathLabel: fullPathLabel, // Full normalized path
    basePathFull: dir.path, // Absolute system path
  });
}
```

**File Record Normalization**:

```javascript
#normalizeFileRecord(filePath, fileStats, basePath, scanTimestamp) {
  return {
    fileName: path.basename(filePath),
    fileExtension: path.extname(filePath).toLowerCase().replace('.', ''),
    directoryPath: PathNormalizer.normalizePath(path.dirname(filePath)),
    relativePath: PathNormalizer.getRelativePath(filePath, basePath),
    absolutePath: PathNormalizer.normalizePath(filePath),
    sizeBytes: Number(fileStats.size),
    modifiedAt: fileStats.mtime.toISOString(),
    scanTimestamp,
  };
}
```
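The path-related fields of a record can be reproduced with Node's `path.posix` helpers alone. The sketch below is an approximation under assumptions: `describeFilePath` is a hypothetical name, and converting backslashes to forward slashes stands in for whatever `PathNormalizer.normalizePath` and `getRelativePath` actually do.

```javascript
const path = require('node:path');

const toForward = (p) => p.replace(/\\/g, '/');

// Derive the path-related fields of a file record from one file path.
function describeFilePath(filePath, basePath) {
  const file = toForward(filePath);
  const base = toForward(basePath);
  return {
    fileName: path.posix.basename(file),
    // Lowercase extension without the leading dot, e.g. "pdf".
    fileExtension: path.posix.extname(file).toLowerCase().replace('.', ''),
    directoryPath: path.posix.dirname(file),
    relativePath: path.posix.relative(base, file),
    absolutePath: file,
  };
}

console.log(describeFilePath('/data/2023/invoices/Report.PDF', '/data/2023'));
```

Here `relativePath` comes out as `invoices/Report.PDF` and `fileExtension` as `pdf`, so records from different platforms compare equal when the underlying layout is the same.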

#### 3. Config Updates

Configuration now uses normalized paths ([config.js](../src/config/config.js)):

```javascript
#loadScanConfig() {
  // Auto-derive basePathLabel from UPLOAD_BASE_PATH
  if (!basePathLabel && process.env.UPLOAD_BASE_PATH) {
    const basePath = process.env.UPLOAD_BASE_PATH;
    // Normalize for consistent table naming across platforms
    basePathLabel = PathNormalizer.normalizePath(basePath);
  }

  // Generate table name using PathNormalizer
  if (companySlug && serverId && basePathLabel) {
    tableName = PathNormalizer.generateTableName({
      companySlug,
      serverId,
      basePathLabel,
    });
  }
}
```

### Backend Components

#### 1. PathNormalizer Utility (`src/uploader/utils/path-normalizer.util.ts`)

TypeScript version matching CLI logic exactly:

```typescript
export class PathNormalizer {
  static normalizePath(inputPath: string): string { /* ... */ }
  static generateTableName(config: {...}): string { /* ... */ }
  static buildBasePathLabel(basePath: string, relativePath: string): string { /* ... */ }
  // ... other utilities
}
```

#### 2. FileStatsTableManagerService Updates

**Table Name Generation** ([file-stats-table-manager.service.ts](../../arela-api/src/uploader/services/file-stats-table-manager.service.ts)):

```typescript
private generateTableName(config: ScanInstanceConfig): string {
  // Delegate to PathNormalizer for consistency with CLI
  return PathNormalizer.generateTableName(config);
}
```

**Instance Table Lookup**:

```typescript
async getInstanceTables(
  companySlug: string,
  serverId: string,
  basePathLabel: string,
): Promise<CliRegistry[]> {
  // Normalize the base path
  const normalizedBasePath = PathNormalizer.normalizePath(basePathLabel);

  // Match all tables for this company/server whose base_path_full equals
  // the normalized path or starts with it.
  // Note: basePattern (the table-name prefix) is not defined in this excerpt.
  return this.cliRegistryRepository
    .createQueryBuilder('registry')
    .where('registry.table_name LIKE :pattern', {
      pattern: `${basePattern}%`,
    })
    .andWhere('registry.company_slug = :companySlug', { companySlug })
    .andWhere('registry.server_id = :serverId', { serverId })
    .andWhere(
      '(registry.base_path_full = :basePath OR registry.base_path_full LIKE :basePathPrefix)',
      {
        basePath: normalizedBasePath,
        basePathPrefix: `${normalizedBasePath}/%`,
      },
    )
    .orderBy('registry.table_name', 'ASC')
    .getMany();
}
```
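The `base_path_full` condition in this query (exact match, or prefix followed by `/`) can be mirrored in plain JavaScript to show why the trailing slash matters. The function name `isUnderBasePath` and the sample rows are hypothetical; only the matching rule is taken from the query.

```javascript
// True when `candidate` is the base path itself or lies underneath it.
// Appending '/' before the prefix test keeps '/data2' from matching '/data'.
function isUnderBasePath(candidate, basePath) {
  const base = basePath.replace(/\/+$/, '');
  return candidate === base || candidate.startsWith(`${base}/`);
}

// Hypothetical registry rows, shaped like cli_registry records.
const rows = [
  { table_name: 'scan_palco_local_data', base_path_full: '/data' },
  { table_name: 'scan_palco_local_data_2023', base_path_full: '/data/2023' },
  { table_name: 'scan_palco_local_data2', base_path_full: '/data2' },
];

const matches = rows
  .filter((row) => isUnderBasePath(row.base_path_full, '/data'))
  .map((row) => row.table_name);

console.log(matches); // [ 'scan_palco_local_data', 'scan_palco_local_data_2023' ]
```

The path check is the precise filter here. In SQL `LIKE`, `_` matches any single character, so a table-name prefix built from underscore-separated parts matches more loosely than it reads.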

## Command Flow

### 1. Scan Command

```bash
arela scan
```

**Flow**:
1. Load configuration (company, server, base path)
2. Normalize base path using PathNormalizer
3. Discover directories at specified `SCAN_DIRECTORY_LEVEL`
4. For each directory:
   - Build full normalized path label
   - Register instance (creates/finds table)
   - Scan files with normalized paths
   - Upload batch to API

### 2. Identify Command

```bash
arela identify
```

**Flow**:
1. Load configuration
2. Fetch all tables using `getInstanceTables()`
3. For each table:
   - Fetch pending PDFs
   - Detect locally
   - Update with normalized paths

### 3. Propagate Command

```bash
arela propagate
```

**Flow**:
1. Load configuration
2. Fetch all tables using `getInstanceTables()`
3. For each table:
   - Mark files needing propagation
   - Process pedimentos by directory
   - Propagate arela_path to same-directory files

### 4. Push Command

```bash
arela push
```

**Flow**:
1. Load configuration
2. Fetch all tables using `getInstanceTables()`
3. For each table:
   - Fetch files with arela_path
   - Upload to storage API
   - Update upload status

## Configuration Examples

### Single-Level Scan

```bash
# .env
ARELA_COMPANY_SLUG=palco
ARELA_SERVER_ID=local
UPLOAD_BASE_PATH=/data
SCAN_DIRECTORY_LEVEL=0

# Result: Single table scan_palco_local_data
```
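Reading these variables might look like the sketch below. The function name `readScanSettings` is hypothetical, and falling back to level 0 for a missing or invalid `SCAN_DIRECTORY_LEVEL` is an assumption based on 0 being the documented default; the package's own config loader may validate differently.

```javascript
// Read the scan-related settings from an environment-like object.
// A missing, negative, or non-numeric level falls back to 0 (assumption).
function readScanSettings(env = process.env) {
  const parsed = Number.parseInt(env.SCAN_DIRECTORY_LEVEL ?? '', 10);
  return {
    companySlug: env.ARELA_COMPANY_SLUG,
    serverId: env.ARELA_SERVER_ID,
    basePath: env.UPLOAD_BASE_PATH,
    directoryLevel: Number.isInteger(parsed) && parsed >= 0 ? parsed : 0,
  };
}

console.log(
  readScanSettings({
    ARELA_COMPANY_SLUG: 'palco',
    ARELA_SERVER_ID: 'local',
    UPLOAD_BASE_PATH: '/data',
    SCAN_DIRECTORY_LEVEL: '0',
  }),
);
```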

### Multi-Level Scan (Year-based)

```bash
# .env
ARELA_COMPANY_SLUG=palco
ARELA_SERVER_ID=local
UPLOAD_BASE_PATH=/data
SCAN_DIRECTORY_LEVEL=1

# Directory structure:
# /data/2023/...
# /data/2024/...
# /data/archive/...

# Result: Three tables
# - scan_palco_local_data_2023
# - scan_palco_local_data_2024
# - scan_palco_local_data_archive
```

### Multi-Level Scan with Sources

```bash
# .env
ARELA_COMPANY_SLUG=palco
ARELA_SERVER_ID=local
UPLOAD_BASE_PATH=/data
UPLOAD_SOURCES=documents,images
SCAN_DIRECTORY_LEVEL=1

# Directory structure:
# /data/2023/documents/...
# /data/2023/images/...
# /data/2024/documents/...
# /data/2024/images/...

# Result: Four tables
# - scan_palco_local_data_2023_documents
# - scan_palco_local_data_2023_images
# - scan_palco_local_data_2024_documents
# - scan_palco_local_data_2024_images
```
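The four-table result follows from pairing each level-1 directory with each source. The sketch below reproduces that expansion under assumptions: `expandTables` and `sanitize` are hypothetical names, and the sanitizing rule (lowercase, non-alphanumeric runs become one underscore) is inferred from the table names shown, not taken from the package.

```javascript
// Hypothetical sanitizer: lowercase, collapse non-alphanumeric runs to '_',
// and trim underscores at both ends.
const sanitize = (value) =>
  value.toLowerCase().replace(/[^a-z0-9]+/g, '_').replace(/^_+|_+$/g, '');

// Pair every directory found at the scan level with every configured source.
function expandTables({ companySlug, serverId, basePath, levelDirs, sources }) {
  const tables = [];
  for (const dir of levelDirs) {
    for (const source of sources) {
      const fullPath = [basePath, dir, source].join('/');
      tables.push(
        `scan_${sanitize(companySlug)}_${sanitize(serverId)}_${sanitize(fullPath)}`,
      );
    }
  }
  return tables;
}

console.log(
  expandTables({
    companySlug: 'palco',
    serverId: 'local',
    basePath: '/data',
    levelDirs: ['2023', '2024'],
    sources: ['documents', 'images'],
  }),
);
```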

## Platform-Specific Handling

### Windows

**Input Paths**:
```
C:\Users\Documents\2023
D:\Archive\Files
O:\Network\Share
```

**Normalized Paths**:
```
/Users/Documents/2023
/Archive/Files
/Network/Share
```

**Table Names**:
```
scan_company_server_users_documents_2023
scan_company_server_archive_files
scan_company_server_network_share
```

### Linux/macOS

**Input Paths**:
```
/home/user/documents/2023
/mnt/storage/archive
/Volumes/backup/files
```

**Normalized Paths** (unchanged):
```
/home/user/documents/2023
/mnt/storage/archive
/Volumes/backup/files
```

**Table Names**:
```
scan_company_server_home_user_documents_2023
scan_company_server_mnt_storage_archive
scan_company_server_volumes_backup_files
```

## Database Schema

### cli_registry Table

Stores metadata for each scan instance:

```sql
CREATE TABLE cli.cli_registry (
  id UUID PRIMARY KEY,
  table_name VARCHAR(63) UNIQUE NOT NULL,
  company_slug VARCHAR(255) NOT NULL,
  server_id VARCHAR(255) NOT NULL,
  base_path_label TEXT NOT NULL, -- Normalized path (e.g., /data/2023)
  base_path_full TEXT NOT NULL, -- Original absolute path (e.g., C:\data\2023)
  status VARCHAR(50) DEFAULT 'active',
  created_at TIMESTAMP DEFAULT NOW(),
  last_scan_at TIMESTAMP,
  total_files INTEGER DEFAULT 0,
  total_size_bytes BIGINT DEFAULT 0
);

-- Index for efficient lookup
CREATE INDEX idx_cli_registry_lookup
ON cli.cli_registry(company_slug, server_id, base_path_label);
```

### scan_* Tables

Dynamic tables created per instance:

```sql
CREATE TABLE cli.scan_palco_local_data_2023 (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  file_name VARCHAR(255) NOT NULL,
  file_extension VARCHAR(50),
  directory_path TEXT NOT NULL, -- Normalized directory path
  relative_path TEXT NOT NULL, -- Normalized relative path
  absolute_path TEXT NOT NULL, -- Normalized absolute path
  size_bytes BIGINT NOT NULL,
  modified_at TIMESTAMP NOT NULL,
  scan_timestamp TIMESTAMP NOT NULL,

  -- Detection fields
  detected_type VARCHAR(100),
  detected_pedimento VARCHAR(50),
  detected_pedimento_year INTEGER,
  rfc VARCHAR(20),
  arela_path TEXT,
  detection_error TEXT,
  detection_attempts INTEGER DEFAULT 0,
  is_not_pedimento BOOLEAN DEFAULT FALSE,

  -- Propagation fields
  propagated_from_id UUID,
  propagation_error TEXT,
  propagation_attempts INTEGER DEFAULT 0,
  needs_propagation BOOLEAN DEFAULT FALSE,

  -- Upload fields
  uploaded BOOLEAN DEFAULT FALSE,
  upload_path TEXT,
  uploaded_to_storage_id TEXT,
  upload_error TEXT,
  upload_attempts INTEGER DEFAULT 0
);
```

## Benefits

### 1. Cross-Platform Consistency

- Same table names regardless of OS
- Paths stored consistently in database
- No special handling needed per platform

### 2. Full Path Visibility

- Table names clearly show what directory they represent
- Easy to identify and manage specific scans
- Better organization for multi-directory setups

### 3. Scalability

- Support for any directory structure
- Configurable granularity via `SCAN_DIRECTORY_LEVEL`
- Efficient batch processing per directory

### 4. Maintainability

- Centralized path logic in PathNormalizer
- Consistent behavior between CLI and backend
- Easy to test and debug

## Migration Guide

### For Existing Installations

If you have existing scan tables with old naming:

1. **New scans will create new tables** with the updated naming scheme
2. **Old tables remain functional** but won't be automatically migrated
3. **To migrate**:
   - Re-run `arela scan` to create new tables
   - Use `identify`, `propagate`, `push` on new tables
   - Optionally drop old tables when confident

### Backward Compatibility

The system is designed to be backward compatible:

- `getInstanceTables()` matches by `base_path_full` pattern
- Old tables can coexist with new tables
- Commands process all matching tables

## Troubleshooting

### Issue: Tables not found

**Cause**: Base path mismatch between scan and other commands

**Solution**: Ensure `UPLOAD_BASE_PATH` is consistent across commands

### Issue: Duplicate tables

**Cause**: Different path formats creating separate tables

**Solution**: System now prevents this with PathNormalizer

### Issue: Table name too long

**Cause**: Very deep directory structures

**Solution**: System automatically truncates and adds hash for uniqueness
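
One way such truncation could work is sketched below. The scheme is an assumption: this document only says that long names are truncated and given a hash, so the hash function (SHA-256), the 8-character suffix, and the name `fitTableName` are illustrative choices. The 63-character limit matches the `table_name VARCHAR(63)` column and PostgreSQL's default identifier length.

```javascript
const crypto = require('node:crypto');

const MAX_IDENTIFIER_LENGTH = 63; // matches table_name VARCHAR(63)

// If the name is too long, keep a prefix and append a short hash of the full
// name so that two long paths sharing a prefix still get distinct tables.
function fitTableName(name, maxLength = MAX_IDENTIFIER_LENGTH) {
  if (name.length <= maxLength) return name;
  const hash = crypto.createHash('sha256').update(name).digest('hex').slice(0, 8);
  return `${name.slice(0, maxLength - hash.length - 1)}_${hash}`;
}

const deepName = `scan_palco_local_${'a_very_deep_directory_'.repeat(5)}files`;
console.log(fitTableName(deepName).length); // 63
```

Hashing the full, untruncated name keeps the result deterministic, so repeated scans of the same path resolve to the same table.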

## Testing

### Manual Testing

```bash
# Test different path formats
UPLOAD_BASE_PATH="C:\Users\Documents" arela scan # Windows
UPLOAD_BASE_PATH="/Users/Documents" arela scan # Unix
UPLOAD_BASE_PATH="O:/Data/Files" arela scan # Network

# All should create the same table name (normalized)
```

### Unit Testing

Tests are provided in [tests/PathNormalizer.test.js](../tests/PathNormalizer.test.js):

```bash
npm test -- PathNormalizer.test.js
```

## Future Enhancements

1. **Table Consolidation**: Utility to merge data from old tables to new naming scheme
2. **Path Aliases**: Support for custom path labels independent of physical path
3. **Virtual Directories**: Scan multiple non-contiguous directories into one table
4. **Path Filters**: Regex-based filtering at path level

## Summary

The cross-platform path handling system provides:

✅ Consistent table naming across Windows, Linux, and macOS
✅ Full path information in table names
✅ Proper support for `SCAN_DIRECTORY_LEVEL`
✅ All commands (scan, identify, propagate, push) working seamlessly
✅ Normalized paths in database for consistency
✅ Centralized path logic for maintainability

All path operations are now platform-independent and maintain consistency between CLI and backend implementations.