@tfw.in/structura-lib 0.2.0

Files changed (95)
  1. package/PRODUCTION_ARCHITECTURE.md +511 -0
  2. package/README.md +379 -0
  3. package/SAVE_FUNCTIONALITY_COMPLETE.md +448 -0
  4. package/dist/cjs/EditableContent.js +150 -0
  5. package/dist/cjs/HtmlViewer.js +587 -0
  6. package/dist/cjs/PdfComponents.js +16 -0
  7. package/dist/cjs/PdfDocumentViewer.js +281 -0
  8. package/dist/cjs/Structura.js +806 -0
  9. package/dist/cjs/Table.js +164 -0
  10. package/dist/cjs/TableCell.js +115 -0
  11. package/dist/cjs/accuracyMetrics.js +39 -0
  12. package/dist/cjs/helpers/preprocessData.js +143 -0
  13. package/dist/cjs/index.js +7 -0
  14. package/dist/cjs/lib/polyfills.js +15 -0
  15. package/dist/cjs/lib/utils.js +10 -0
  16. package/dist/cjs/node_modules/react-icons/fa/index.esm.js +14 -0
  17. package/dist/cjs/node_modules/react-icons/lib/esm/iconBase.js +69 -0
  18. package/dist/cjs/node_modules/react-icons/lib/esm/iconContext.js +15 -0
  19. package/dist/cjs/polyfills.js +19 -0
  20. package/dist/cjs/route.js +102 -0
  21. package/dist/cjs/styles.css +7 -0
  22. package/dist/cjs/styles.css.map +1 -0
  23. package/dist/cjs/ui/badge.js +34 -0
  24. package/dist/cjs/ui/button.js +71 -0
  25. package/dist/cjs/ui/card.js +86 -0
  26. package/dist/cjs/ui/progress.js +45 -0
  27. package/dist/cjs/ui/scroll-area.js +62 -0
  28. package/dist/cjs/ui/tabs.js +60 -0
  29. package/dist/cjs/worker.js +36 -0
  30. package/dist/esm/EditableContent.js +161 -0
  31. package/dist/esm/HtmlViewer.js +640 -0
  32. package/dist/esm/PdfComponents.js +21 -0
  33. package/dist/esm/PdfDocumentViewer.js +294 -0
  34. package/dist/esm/Structura.js +951 -0
  35. package/dist/esm/Table.js +182 -0
  36. package/dist/esm/TableCell.js +122 -0
  37. package/dist/esm/_virtual/_rollupPluginBabelHelpers.js +305 -0
  38. package/dist/esm/accuracyMetrics.js +41 -0
  39. package/dist/esm/helpers/preprocessData.js +152 -0
  40. package/dist/esm/index.js +1 -0
  41. package/dist/esm/lib/polyfills.js +13 -0
  42. package/dist/esm/lib/utils.js +8 -0
  43. package/dist/esm/node_modules/react-icons/fa/index.esm.js +11 -0
  44. package/dist/esm/node_modules/react-icons/lib/esm/iconBase.js +66 -0
  45. package/dist/esm/node_modules/react-icons/lib/esm/iconContext.js +12 -0
  46. package/dist/esm/polyfills.js +17 -0
  47. package/dist/esm/route.js +154 -0
  48. package/dist/esm/styles.css +7 -0
  49. package/dist/esm/styles.css.map +1 -0
  50. package/dist/esm/types/EditableContent.d.ts +9 -0
  51. package/dist/esm/types/HtmlViewer.d.ts +10 -0
  52. package/dist/esm/types/PdfComponents.d.ts +35 -0
  53. package/dist/esm/types/PdfDocumentViewer.d.ts +22 -0
  54. package/dist/esm/types/Structura.d.ts +11 -0
  55. package/dist/esm/types/Table.d.ts +12 -0
  56. package/dist/esm/types/TableCell.d.ts +13 -0
  57. package/dist/esm/types/accuracy.d.ts +23 -0
  58. package/dist/esm/types/accuracyMetrics.d.ts +5 -0
  59. package/dist/esm/types/helpers/flattenJSON.d.ts +1 -0
  60. package/dist/esm/types/helpers/hardMerging.d.ts +2 -0
  61. package/dist/esm/types/helpers/index.d.ts +6 -0
  62. package/dist/esm/types/helpers/jsonToHtml.d.ts +40 -0
  63. package/dist/esm/types/helpers/preprocessData.d.ts +3 -0
  64. package/dist/esm/types/helpers/removeMetadata.d.ts +1 -0
  65. package/dist/esm/types/helpers/tableProcessor.d.ts +1 -0
  66. package/dist/esm/types/index.d.ts +3 -0
  67. package/dist/esm/types/lib/polyfills.d.ts +1 -0
  68. package/dist/esm/types/lib/utils.d.ts +2 -0
  69. package/dist/esm/types/polyfills.d.ts +1 -0
  70. package/dist/esm/types/route.d.ts +45 -0
  71. package/dist/esm/types/test-app/src/App.d.ts +4 -0
  72. package/dist/esm/types/test-app/src/main.d.ts +1 -0
  73. package/dist/esm/types/test-app/vite.config.d.ts +2 -0
  74. package/dist/esm/types/types.d.ts +23 -0
  75. package/dist/esm/types/ui/alert.d.ts +8 -0
  76. package/dist/esm/types/ui/badge.d.ts +9 -0
  77. package/dist/esm/types/ui/button.d.ts +11 -0
  78. package/dist/esm/types/ui/card.d.ts +8 -0
  79. package/dist/esm/types/ui/progress.d.ts +6 -0
  80. package/dist/esm/types/ui/scroll-area.d.ts +5 -0
  81. package/dist/esm/types/ui/skeleton.d.ts +2 -0
  82. package/dist/esm/types/ui/tabs.d.ts +7 -0
  83. package/dist/esm/types/worker.d.ts +1 -0
  84. package/dist/esm/ui/badge.js +31 -0
  85. package/dist/esm/ui/button.js +50 -0
  86. package/dist/esm/ui/card.js +67 -0
  87. package/dist/esm/ui/progress.js +26 -0
  88. package/dist/esm/ui/scroll-area.js +45 -0
  89. package/dist/esm/ui/tabs.js +39 -0
  90. package/dist/esm/worker.js +50 -0
  91. package/dist/index.d.ts +38 -0
  92. package/package.json +85 -0
  93. package/server/README.md +203 -0
  94. package/server/db.js +142 -0
  95. package/server/server.js +165 -0
@@ -0,0 +1,511 @@
1
+ # Production Architecture - Structura Edit System
2
+
3
+ ## Overview
4
+
5
+ A scalable, production-ready system for editing structured documents with persistent storage and version history.
6
+
7
+ ## Design Principles
8
+
9
+ 1. **Auto-initialization**: Documents are created automatically on first save
10
+ 2. **Version tracking**: Every edit is stored with full history
11
+ 3. **Baseline preservation**: Original JSON is preserved as the baseline for diffs
12
+ 4. **Scalability**: Works with any PDF/JSON combination without pre-seeding
13
+ 5. **Zero configuration**: No manual database setup required
14
+
15
+ ## Architecture
16
+
17
+ ```
18
+ ┌──────────────────────┐
19
+ │ Client Browser │
20
+ │ │
21
+ │ 1. Load PDF + JSON │ ──────┐
22
+ │ 2. User edits │ │
23
+ │ 3. Click Save │ │
24
+ └──────────────────────┘ │
25
+ │ │
26
+ │ HTTP POST │ Initial Load
27
+ │ /api/save │ (from file)
28
+ ▼ │
29
+ ┌──────────────────────┐ │
30
+ │ Express Server │ │
31
+ │ (port 3002) │◄──────┘
32
+ │ │
33
+ │ Smart Save Logic: │
34
+ │ - First save? │
35
+ │ Store original │
36
+ │ - Subsequent? │
37
+ │ Store edit only │
38
+ └──────────────────────┘
39
+
40
+ │ SQL Insert
41
+
42
+ ┌──────────────────────┐
43
+ │ SQLite Database │
44
+ │ (edits.db) │
45
+ │ │
46
+ │ documents table │
47
+ │ - pdf_name (key) │
48
+ │ - original_json │
49
+ │ │
50
+ │ edits table │
51
+ │ - document_id │
52
+ │ - edited_json │
53
+ │ - edit_summary │
54
+ │ - created_at │
55
+ └──────────────────────┘
56
+ ```
57
+
58
+ ## Data Flow
59
+
60
+ ### First Save (Document Creation)
61
+
62
+ ```
63
+ User loads: ?pdf=/doc.pdf&json=/data.json
64
+
65
+ ├─ Frontend loads JSON from file
66
+ │ - Stores as mockData (for display)
67
+ │ - Stores as originalData (for save)
68
+
69
+ User makes edits
70
+
71
+ └─ User clicks Save
72
+
73
+
74
+ Frontend sends:
75
+ {
76
+ pdfName: "doc.pdf",
77
+ editedJson: {...}, // Current state
78
+ originalJson: {...} // Baseline
79
+ }
80
+
81
+
82
+ Backend:
83
+ - Creates document with originalJson
84
+ - Creates first edit with editedJson
85
+ - Returns documentId + editId
86
+
87
+
88
+ Frontend:
89
+ - Clears originalData (already saved)
90
+ - Shows success message
91
+ ```
92
+
93
+ ### Subsequent Saves (Edit Updates)
94
+
95
+ ```
96
+ User makes more edits
97
+
98
+ └─ User clicks Save
99
+
100
+
101
+ Frontend sends:
102
+ {
103
+ pdfName: "doc.pdf",
104
+ editedJson: {...} // New state
105
+ // No originalJson
106
+ }
107
+
108
+
109
+ Backend:
110
+ - Finds existing document
111
+ - Creates new edit entry
112
+ - Returns documentId + editId
113
+ ```
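The two payload shapes above differ only in whether `originalJson` is attached. A minimal sketch of how a client might build them, assuming a hypothetical `createSaveSession` helper (not part of the package) that holds the baseline until the first save is confirmed:

```javascript
// Hypothetical client-side helper; names are illustrative, not the package's API.
function createSaveSession(pdfName, originalData) {
  // The baseline is held only until the first successful save.
  let pendingOriginal = originalData;

  return {
    // Build the request body for POST /api/save.
    buildPayload(editedJson, summary) {
      const payload = { pdfName, editedJson };
      if (pendingOriginal !== null) payload.originalJson = pendingOriginal;
      if (summary) payload.summary = summary;
      return payload;
    },
    // Call after the server confirms the save, so later payloads omit the baseline.
    markSaved() {
      pendingOriginal = null;
    },
  };
}
```

Clearing the baseline only after a confirmed save means a failed first request can be retried with `originalJson` still attached.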
114
+
115
+ ## API Design
116
+
117
+ ### POST /api/save
118
+
119
+ **Request:**
120
+ ```json
121
+ {
122
+ "pdfName": "document.pdf",
123
+ "editedJson": { /* current state */ },
124
+ "originalJson": { /* baseline (first save only) */ },
125
+ "summary": "Optional description"
126
+ }
127
+ ```
128
+
129
+ **Logic:**
130
+ 1. Check if document exists (by pdfName)
131
+ 2. If new:
132
+ - Create document with originalJson (or editedJson as fallback)
133
+ - Create first edit with editedJson
134
+ 3. If existing:
135
+ - Create new edit with editedJson
136
+ 4. Return documentId and editId
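The four steps above can be sketched without a database. This is an illustration under stated assumptions, not the server's actual code: `createStore` is a hypothetical in-memory stand-in for the `documents` and `edits` tables, and the validation message is invented for the example.

```javascript
// In-memory stand-in for the documents and edits tables (assumption for illustration).
function createStore() {
  const documents = new Map(); // pdfName -> { id, pdfName, originalJson }
  const edits = [];            // { id, documentId, editedJson, summary }
  let nextDocId = 1;
  let nextEditId = 1;

  function save({ pdfName, editedJson, originalJson, summary }) {
    if (typeof pdfName !== 'string' || pdfName === '' || editedJson === undefined) {
      return { success: false, message: 'pdfName and editedJson are required' };
    }
    let doc = documents.get(pdfName);
    if (!doc) {
      // First save: keep originalJson as the baseline, falling back to editedJson.
      doc = { id: nextDocId++, pdfName, originalJson: originalJson ?? editedJson };
      documents.set(pdfName, doc);
    }
    // Every save, first or not, appends a full-state edit.
    const edit = { id: nextEditId++, documentId: doc.id, editedJson, summary: summary ?? null };
    edits.push(edit);
    return { success: true, documentId: doc.id, editId: edit.id };
  }

  return { save, documents, edits };
}
```

Note that an `originalJson` sent for an already-known document is ignored here, so the baseline cannot be overwritten by a later request.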
137
+
138
+ **Response:**
139
+ ```json
140
+ {
141
+ "success": true,
142
+ "documentId": 123,
143
+ "editId": 456,
144
+ "message": "Document saved successfully"
145
+ }
146
+ ```
147
+
148
+ ### GET /api/load/:pdfName
149
+
150
+ **Response:**
151
+ ```json
152
+ {
153
+ "success": true,
154
+ "document": {
155
+ "id": 123,
156
+ "pdfName": "document.pdf",
157
+ "originalJson": { /* baseline */ },
158
+ "currentJson": { /* latest state */ },
159
+ "latestEdit": {
160
+ "id": 456,
161
+ "edit_summary": "Edit via UI",
162
+ "created_at": "2025-11-18 10:00:00"
163
+ }
164
+ }
165
+ }
166
+ ```
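A sketch of how the response above might be assembled from database rows. `buildLoadResponse` is a hypothetical helper; the row fields follow the schema's column names, and the fallback to the baseline when a document has no edits is an assumption, not confirmed behavior.

```javascript
// Assemble a /api/load-style response body from a document row and its edit rows.
// JSON columns are stored as text, so they are parsed here.
function buildLoadResponse(documentRow, editRows) {
  // Latest edit = highest id (AUTOINCREMENT ids grow with insertion order).
  const latest = editRows.reduce((a, b) => (a === null || b.id > a.id ? b : a), null);
  const originalJson = JSON.parse(documentRow.original_json);
  return {
    success: true,
    document: {
      id: documentRow.id,
      pdfName: documentRow.pdf_name,
      originalJson,
      // Assumption: with no edits yet, the current state is the baseline.
      currentJson: latest ? JSON.parse(latest.edited_json) : originalJson,
      latestEdit: latest
        ? { id: latest.id, edit_summary: latest.edit_summary, created_at: latest.created_at }
        : null,
    },
  };
}
```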
167
+
168
+ ### GET /api/history/:pdfName
169
+
170
+ **Response:**
171
+ ```json
172
+ {
173
+ "success": true,
174
+ "document": { "id": 123, "pdfName": "document.pdf" },
175
+ "history": [
176
+ {
177
+ "id": 456,
178
+ "edited_json": { /* state */ },
179
+ "edit_summary": "Edit via UI",
180
+ "created_at": "2025-11-18 10:00:00"
181
+ },
182
+ // ... older edits
183
+ ]
184
+ }
185
+ ```
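The "older edits" placeholder in the response suggests newest-first ordering. A minimal sketch of that ordering, assuming a hypothetical `orderHistory` helper and rows shaped like the `edits` table:

```javascript
// Sort edit rows newest first. "YYYY-MM-DD HH:MM:SS" strings compare correctly
// as plain text; ties within the same second fall back to the higher id.
function orderHistory(editRows) {
  return [...editRows].sort((a, b) => {
    if (a.created_at !== b.created_at) return a.created_at < b.created_at ? 1 : -1;
    return b.id - a.id;
  });
}
```

The id tie-break matters because `CURRENT_TIMESTAMP` in SQLite has one-second resolution, so two quick saves can share a timestamp.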
186
+
187
+ ## Database Schema
188
+
189
+ ### documents table
190
+ ```sql
191
+ CREATE TABLE documents (
192
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
193
+ pdf_name TEXT NOT NULL UNIQUE, -- Unique identifier
194
+ original_json TEXT NOT NULL, -- Baseline for diffs
195
+ created_at DATETIME DEFAULT CURRENT_TIMESTAMP
196
+ );
197
+
198
+ CREATE INDEX idx_document_pdf ON documents(pdf_name);
199
+ ```
200
+
201
+ ### edits table
202
+ ```sql
203
+ CREATE TABLE edits (
204
+ id INTEGER PRIMARY KEY AUTOINCREMENT,
205
+ document_id INTEGER NOT NULL,
206
+ edited_json TEXT NOT NULL, -- Full state at this edit
207
+ edit_summary TEXT, -- Optional description
208
+ created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
209
+ FOREIGN KEY (document_id) REFERENCES documents(id)
210
+ );
211
+
212
+ CREATE INDEX idx_edits_document ON edits(document_id);
213
+ ```
214
+
215
+ ## Scalability Features
216
+
217
+ ### 1. Automatic Document Creation
218
+ - No pre-seeding required
219
+ - Works with any PDF/JSON combination
220
+ - Document is created on first save
221
+
222
+ ### 2. Version History
223
+ - Every edit stores the full document state (a snapshot, not a diff)
224
+ - Earlier states are never overwritten, so no edit is lost
225
+ - Can restore any previous version
226
+
227
+ ### 3. Efficient Storage
228
+ - Original JSON stored once in documents table
229
+ - Edits stored in separate table
230
+ - Indexed for fast lookups
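Because each edit row holds the full state, storage grows roughly linearly with the number of saves. A rough way to estimate that growth, assuming a hypothetical `estimateStorageBytes` helper that counts only the serialized JSON text and ignores SQLite's own overhead:

```javascript
// Estimate the bytes of JSON text stored for one document and its edits.
function estimateStorageBytes(originalJson, editedStates) {
  const size = (value) => Buffer.byteLength(JSON.stringify(value), 'utf8');
  const originalBytes = size(originalJson);
  const editBytes = editedStates.reduce((total, state) => total + size(state), 0);
  return { originalBytes, editBytes, totalBytes: originalBytes + editBytes };
}
```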
231
+
232
+ ### 4. Query Patterns
233
+
234
+ **Get latest state:**
235
+ ```javascript
236
+ const doc = getDocumentWithLatestEdit(pdfName);
237
+ // Returns: original + latest edit
238
+ ```
239
+
240
+ **Get full history:**
241
+ ```javascript
242
+ const history = getEditHistory(documentId);
243
+ // Returns: all edits ordered by timestamp
244
+ ```
245
+
246
+ **Restore previous version:**
247
+ ```javascript
248
+ const edit = getEdit(editId);
249
+ // Use edit.edited_json to restore
250
+ ```
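One way to restore without rewriting history is to append the old state as a new edit. The sketch below assumes a hypothetical `restoreVersion` helper operating on an in-memory array of edit rows; the summary text is invented for the example.

```javascript
// Restore by appending a copy of an old edit as the newest edit.
// Returns a new array; the input array is left untouched.
function restoreVersion(edits, editId) {
  const target = edits.find((e) => e.id === editId);
  if (!target) throw new Error(`Edit ${editId} not found`);
  const nextId = edits.reduce((max, e) => Math.max(max, e.id), 0) + 1;
  const restored = {
    id: nextId,
    document_id: target.document_id,
    edited_json: target.edited_json,
    edit_summary: `Restored from edit ${editId}`,
  };
  return [...edits, restored];
}
```

With this approach the restore itself shows up in the history and can be undone like any other edit.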
251
+
252
+ ## Production Deployment
253
+
254
+ ### Environment Variables
255
+
256
+ ```bash
257
+ # Server configuration
258
+ PORT=3002 # Server port
259
+ NODE_ENV=production # Environment
260
+
261
+ # Database configuration
262
+ DATABASE_PATH=./server/edits.db # SQLite path
263
+ MAX_JSON_SIZE=50mb # Request body limit
264
+ ```
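A sketch of reading these variables at startup. Whether the bundled server reads all of them is not shown here, so treat `loadConfig` as a hypothetical helper; the `development` default for `NODE_ENV` is an assumption, and the other defaults mirror the values listed above.

```javascript
// Read the variables above from an env object, falling back to the documented values.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.PORT ?? '3002', 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    nodeEnv: env.NODE_ENV ?? 'development', // assumed default
    databasePath: env.DATABASE_PATH ?? './server/edits.db',
    maxJsonSize: env.MAX_JSON_SIZE ?? '50mb',
  };
}
```

Failing fast on a malformed `PORT` surfaces configuration mistakes at startup rather than on the first request.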
265
+
266
+ ### Process Management
267
+
268
+ ```bash
269
+ # Development
270
+ npm run server # Basic server
271
+ npm run server:dev # With nodemon
272
+
273
+ # Production (recommended)
274
+ pm2 start server/server.js --name structura-edit
275
+ pm2 save
276
+ pm2 startup
277
+ ```
278
+
279
+ ### Nginx Configuration
280
+
281
+ ```nginx
282
+ server {
283
+ listen 80;
284
+ server_name your-domain.com;
285
+
286
+ # API endpoints
287
+ location /api/ {
288
+ proxy_pass http://localhost:3002;
289
+ proxy_http_version 1.1;
290
+ proxy_set_header Upgrade $http_upgrade;
291
+ proxy_set_header Connection 'upgrade';
292
+ proxy_set_header Host $host;
293
+ proxy_cache_bypass $http_upgrade;
294
+ client_max_body_size 50M; # Match MAX_JSON_SIZE
295
+ }
296
+
297
+ # Frontend
298
+ location / {
299
+ proxy_pass http://localhost:5175;
300
+ # ... similar proxy settings
301
+ }
302
+ }
303
+ ```
304
+
305
+ ## Usage Patterns
306
+
307
+ ### Pattern 1: Direct File Load (Current)
308
+
309
+ ```typescript
310
+ // Load from file system
311
+ const url = 'http://localhost:5175/?pdf=/doc.pdf&json=/data.json';
312
+
313
+ // First save:
314
+ // - Frontend sends originalJson + editedJson
315
+ // - Backend creates document + first edit
316
+
317
+ // Subsequent saves:
318
+ // - Frontend sends only editedJson
319
+ // - Backend creates new edit
320
+ ```
321
+
322
+ ### Pattern 2: API-Generated JSON
323
+
324
+ ```typescript
325
+ // Load from API
326
+ const url = 'http://localhost:5175/?pdf=/doc.pdf';
327
+ // API processes PDF, returns JSON
328
+ // Store in mockData + originalData
329
+
330
+ // Save flow same as Pattern 1
331
+ ```
332
+
333
+ ### Pattern 3: Resume Previous Session
334
+
335
+ ```typescript
336
+ // Load from database
337
+ fetch('/api/load/doc.pdf')
338
+ .then(res => res.json())
339
+ .then(data => {
340
+ // Use data.document.currentJson
341
+ // Continue editing from last state
342
+ });
343
+ ```
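Since `pdfName` travels in the URL path for the load and history endpoints, names containing spaces or slashes need encoding before the request is made. A minimal sketch, assuming a hypothetical `buildApiUrl` helper:

```javascript
// Build a URL for /api/load/:pdfName or /api/history/:pdfName.
// encodeURIComponent keeps spaces, slashes, and other reserved characters
// from being read as path separators or breaking the URL.
function buildApiUrl(kind, pdfName, base = '') {
  if (kind !== 'load' && kind !== 'history') {
    throw new Error(`Unknown endpoint: ${kind}`);
  }
  return `${base}/api/${kind}/${encodeURIComponent(pdfName)}`;
}
```

For example, `fetch(buildApiUrl('load', 'Q3 report.pdf'))` requests `/api/load/Q3%20report.pdf`.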
344
+
345
+ ## Monitoring & Observability
346
+
347
+ ### Logging
348
+
349
+ Server logs include:
350
+ - Document creation: `[Server] Created new document for {pdfName}`
351
+ - Edit saves: `[Server] Saved edit {editId} for document {documentId}`
352
+ - Errors: `[Server] Error saving document: {details}`
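Keeping these formats in one place makes them easier to grep for and to keep consistent. A small sketch, assuming a hypothetical `logMessages` object that reproduces the three formats listed above:

```javascript
// Formatters for the three log lines listed above.
const logMessages = {
  documentCreated: (pdfName) => `[Server] Created new document for ${pdfName}`,
  editSaved: (editId, documentId) => `[Server] Saved edit ${editId} for document ${documentId}`,
  saveError: (details) => `[Server] Error saving document: ${details}`,
};
```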
353
+
354
+ ### Metrics to Track
355
+
356
+ 1. **Save Operations**
357
+ - New documents created per day
358
+ - Edits saved per document
359
+ - Average time between edits
360
+
361
+ 2. **Database Health**
362
+ - Database size growth
363
+ - Query performance
364
+ - Index usage
365
+
366
+ 3. **API Performance**
367
+ - Request latency
368
+ - Error rates
369
+ - Payload sizes
370
+
371
+ ## Security Considerations
372
+
373
+ ### Current (Development)
374
+
375
+ - CORS enabled for all origins
376
+ - No authentication
377
+ - No rate limiting
378
+ - SQLite file-based storage
379
+
380
+ ### Production Recommendations
381
+
382
+ 1. **Authentication**
383
+ ```javascript
384
+ app.use('/api', authenticateToken);
385
+ ```
386
+
387
+ 2. **Authorization**
388
+ ```javascript
389
+ // Store user_id in documents table
390
+ // Verify ownership before save/load
391
+ ```
392
+
393
+ 3. **Rate Limiting**
394
+ ```javascript
395
+ const rateLimit = require('express-rate-limit');
396
+ app.use('/api/save', rateLimit({
397
+ windowMs: 15 * 60 * 1000,
398
+ max: 100
399
+ }));
400
+ ```
401
+
402
+ 4. **Input Validation**
403
+ ```javascript
404
+ const { body, validationResult } = require('express-validator');
405
+ app.post('/api/save',
406
+ body('pdfName').isString().trim().escape(),
407
+ body('editedJson').isObject(),
408
+ // ... validate
409
+ );
410
+ ```
411
+
412
+ 5. **Database Migration**
413
+ - Consider PostgreSQL for production
414
+ - Supports concurrent writes
415
+ - Better for multi-user scenarios
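The `authenticateToken` middleware referenced in item 1 is not defined in this document. One possible shape is sketched below, assuming a single shared bearer token; `makeAuthenticateToken` and the response messages are hypothetical, and a real deployment would more likely verify per-user tokens.

```javascript
const nodeCrypto = require('node:crypto');

// Hypothetical middleware factory: accepts requests carrying
// "Authorization: Bearer <token>" where <token> matches expectedToken.
function makeAuthenticateToken(expectedToken) {
  const expected = nodeCrypto.createHash('sha256').update(expectedToken).digest();
  return function authenticateToken(req, res, next) {
    const header = req.headers.authorization ?? '';
    const [scheme, token] = header.split(' ');
    if (scheme !== 'Bearer' || !token) {
      return res.status(401).json({ success: false, message: 'Missing bearer token' });
    }
    // Hash both sides so timingSafeEqual always compares equal-length buffers.
    const actual = nodeCrypto.createHash('sha256').update(token).digest();
    if (!nodeCrypto.timingSafeEqual(actual, expected)) {
      return res.status(403).json({ success: false, message: 'Invalid token' });
    }
    next();
  };
}
```

It would be mounted as `app.use('/api', makeAuthenticateToken(secret))`, where `secret` comes from configuration rather than source code.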
416
+
417
+ ## Testing
418
+
419
+ ### Unit Tests
420
+
421
+ ```javascript
422
+ // Test document creation
423
+ const docId = saveDocument('test.pdf', originalData);
424
+ expect(docId).toBeGreaterThan(0);
425
+
426
+ // Test edit saves
427
+ const editId = saveEdit(docId, editedData, 'Test edit');
428
+ expect(editId).toBeGreaterThan(0);
429
+
430
+ // Test retrieval
431
+ const doc = getDocumentWithLatestEdit('test.pdf');
432
+ expect(doc.current_json).toEqual(editedData);
433
+ ```
434
+
435
+ ### Integration Tests
436
+
437
+ ```bash
438
+ # Test save endpoint
439
+ curl -X POST http://localhost:3002/api/save \
440
+ -H "Content-Type: application/json" \
441
+ -d @test-data.json
442
+
443
+ # Test load endpoint
444
+ curl http://localhost:3002/api/load/test.pdf
445
+
446
+ # Test history
447
+ curl http://localhost:3002/api/history/test.pdf
448
+ ```
449
+
450
+ ## Backup & Recovery
451
+
452
+ ### Backup Strategy
453
+
454
+ ```bash
455
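+ # Automated backups (crontab entry; .backup takes a consistent snapshot even while the server is writing)
456
+ 0 2 * * * sqlite3 /path/to/edits.db ".backup '/backups/edits-$(date +\%Y\%m\%d).db'"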
457
+
458
+ # Retention: 30 days
459
+ find /backups -name "edits-*.db" -mtime +30 -delete
460
+ ```
461
+
462
+ ### Recovery
463
+
464
+ ```bash
465
+ # Restore from backup
466
+ cp /backups/edits-20251118.db /path/to/edits.db
467
+ pm2 restart structura-edit
468
+ ```
469
+
470
+ ## Future Enhancements
471
+
472
+ 1. **Real-time Collaboration**
473
+ - WebSocket for live updates
474
+ - Operational transform for conflict resolution
475
+
476
+ 2. **Diff View**
477
+ - Compare original vs current
478
+ - Highlight changes per edit
479
+
480
+ 3. **Undo/Redo**
481
+ - Restore previous edit states
482
+ - Apply/revert specific edits
483
+
484
+ 4. **Export Options**
485
+ - Export to PDF with changes
486
+ - Export diff report
487
+ - Export full history
488
+
489
+ 5. **Search & Filter**
490
+ - Search within documents
491
+ - Filter by edit date/user
492
+ - Full-text search in SQLite
493
+
494
+ ## Status
495
+
496
+ ✅ **Implemented Features:**
497
+ - Automatic document creation
498
+ - Version history tracking
499
+ - Scalable for any PDF/JSON
500
+ - RESTful API design
501
+ - SQLite storage
502
+ - Error handling
503
+ - Logging
504
+
505
+ 🔄 **TODO for Production:**
506
+ - Add authentication
507
+ - Add rate limiting
508
+ - Migrate to PostgreSQL
509
+ - Add monitoring/metrics
510
+ - Add automated backups
511
+ - Write comprehensive tests