@fluentcommerce/fc-connect-sdk 0.1.54 → 0.1.55

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.
Files changed (475)
  1. package/CHANGELOG.md +12 -0
  2. package/dist/cjs/clients/fluent-client.js +13 -6
  3. package/dist/cjs/utils/pagination-helpers.js +38 -2
  4. package/dist/cjs/versori/fluent-versori-client.js +11 -5
  5. package/dist/esm/clients/fluent-client.js +13 -6
  6. package/dist/esm/utils/pagination-helpers.js +38 -2
  7. package/dist/esm/versori/fluent-versori-client.js +11 -5
  8. package/dist/tsconfig.esm.tsbuildinfo +1 -1
  9. package/dist/tsconfig.tsbuildinfo +1 -1
  10. package/dist/tsconfig.types.tsbuildinfo +1 -1
  11. package/docs/00-START-HERE/EXPORT-VALIDATION.md +158 -158
  12. package/docs/00-START-HERE/cli-analyze-source-structure-guide.md +655 -655
  13. package/docs/00-START-HERE/cli-documentation-index.md +202 -202
  14. package/docs/00-START-HERE/cli-quick-reference.md +252 -252
  15. package/docs/00-START-HERE/decision-tree.md +552 -552
  16. package/docs/00-START-HERE/getting-started.md +1070 -1070
  17. package/docs/00-START-HERE/mapper-quick-decision-guide.md +235 -235
  18. package/docs/00-START-HERE/readme.md +237 -237
  19. package/docs/00-START-HERE/retailerid-configuration.md +404 -404
  20. package/docs/00-START-HERE/sdk-philosophy.md +794 -794
  21. package/docs/00-START-HERE/troubleshooting-quick-reference.md +1086 -1086
  22. package/docs/01-TEMPLATES/faq.md +686 -686
  23. package/docs/01-TEMPLATES/patterns/pattern-templates-guide.md +68 -68
  24. package/docs/01-TEMPLATES/patterns/patterns-csv-schema-validation-and-rejection-report.md +233 -233
  25. package/docs/01-TEMPLATES/patterns/patterns-custom-resolvers.md +407 -407
  26. package/docs/01-TEMPLATES/patterns/patterns-error-handling-retry.md +511 -511
  27. package/docs/01-TEMPLATES/patterns/patterns-field-mapping-universal.md +701 -701
  28. package/docs/01-TEMPLATES/patterns/patterns-large-file-splitting.md +1430 -1430
  29. package/docs/01-TEMPLATES/patterns/patterns-master-data-etl.md +2399 -2399
  30. package/docs/01-TEMPLATES/patterns/patterns-pagination-streaming.md +447 -447
  31. package/docs/01-TEMPLATES/patterns/patterns-state-duplicate-prevention.md +385 -385
  32. package/docs/01-TEMPLATES/readme.md +957 -957
  33. package/docs/01-TEMPLATES/standalone/standalone-asn-inbound-processing.md +1209 -1209
  34. package/docs/01-TEMPLATES/standalone/standalone-graphql-query-export.md +1140 -1140
  35. package/docs/01-TEMPLATES/standalone/standalone-graphql-to-parquet-partitioned-s3.md +432 -432
  36. package/docs/01-TEMPLATES/standalone/standalone-multi-channel-inventory-sync.md +1185 -1185
  37. package/docs/01-TEMPLATES/standalone/standalone-multi-source-aggregation.md +1462 -1462
  38. package/docs/01-TEMPLATES/standalone/standalone-s3-csv-batch-api.md +1390 -1390
  39. package/docs/01-TEMPLATES/standalone/standalone-s3-csv-inventory-to-batch.md +330 -330
  40. package/docs/01-TEMPLATES/standalone/standalone-scripts-guide.md +87 -87
  41. package/docs/01-TEMPLATES/standalone/standalone-sftp-xml-graphql.md +1444 -1444
  42. package/docs/01-TEMPLATES/standalone/standalone-webhook-payload-processing.md +688 -688
  43. package/docs/01-TEMPLATES/versori/business-examples/business-examples-dropship-order-routing.md +193 -193
  44. package/docs/01-TEMPLATES/versori/business-examples/business-examples-graphql-parquet-extraction.md +518 -518
  45. package/docs/01-TEMPLATES/versori/business-examples/business-examples-inter-location-transfers.md +2162 -2162
  46. package/docs/01-TEMPLATES/versori/business-examples/business-examples-pre-order-allocation.md +2226 -2226
  47. package/docs/01-TEMPLATES/versori/business-examples/business-scenarios-guide.md +87 -87
  48. package/docs/01-TEMPLATES/versori/patterns/versori-patterns-connection-validation-pattern.md +656 -656
  49. package/docs/01-TEMPLATES/versori/patterns/versori-patterns-dual-workflow-connector.md +835 -835
  50. package/docs/01-TEMPLATES/versori/patterns/versori-patterns-guide.md +108 -108
  51. package/docs/01-TEMPLATES/versori/patterns/versori-patterns-kv-state-management.md +1533 -1533
  52. package/docs/01-TEMPLATES/versori/patterns/versori-patterns-xml-response-patterns.md +1160 -1160
  53. package/docs/01-TEMPLATES/versori/versori-platform-guide.md +201 -201
  54. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-asn-purchase-order.md +1906 -1906
  55. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-dropship-routing.md +1074 -1074
  56. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-flash-sale-reserve.md +1395 -1395
  57. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-generic-xml-order.md +888 -888
  58. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-payment-gateway-integration.md +2478 -2478
  59. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-rma-returns-comprehensive.md +2240 -2240
  60. package/docs/01-TEMPLATES/versori/webhooks/template-webhook-xml-order-ingestion.md +2029 -2029
  61. package/docs/01-TEMPLATES/versori/webhooks/webhook-templates-guide.md +140 -140
  62. package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/inventory-mapping.json +20 -20
  63. package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/products_2025-01-22.csv +11 -11
  64. package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/sample-data-guide.md +34 -34
  65. package/docs/01-TEMPLATES/versori/workflows/_examples/workflow-examples-guide.md +36 -36
  66. package/docs/01-TEMPLATES/versori/workflows/extraction/extraction-modes-guide.md +1038 -1038
  67. package/docs/01-TEMPLATES/versori/workflows/extraction/extraction-workflows-guide.md +138 -138
  68. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/graphql-extraction-guide.md +63 -63
  69. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-fulfillments-to-sftp-csv.md +2062 -2062
  70. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-fulfillments-to-sftp-xml.md +2294 -2294
  71. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-positions-to-s3-csv.md +2461 -2461
  72. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-positions-to-sftp-xml.md +2529 -2529
  73. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-quantities-to-s3-csv.md +2464 -2464
  74. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-quantities-to-s3-json.md +1959 -1959
  75. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-orders-to-s3-csv.md +1953 -1953
  76. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-orders-to-sftp-xml.md +2541 -2541
  77. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-products-to-s3-json.md +2384 -2384
  78. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-products-to-sftp-xml.md +2445 -2445
  79. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-s3-csv.md +2355 -2355
  80. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-s3-json.md +2042 -2042
  81. package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-sftp-xml.md +2726 -2726
  82. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/batch-api-guide.md +206 -206
  83. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-cycle-count-reconciliation.md +2030 -2030
  84. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-multi-channel-inventory-sync.md +1882 -1882
  85. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-csv-inventory-batch.md +2827 -2827
  86. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-json-inventory-batch.md +1952 -1952
  87. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-xml-inventory-batch.md +3289 -3289
  88. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-csv-inventory-batch.md +3064 -3064
  89. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-json-inventory-batch.md +3238 -3238
  90. package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-xml-inventory-batch.md +2977 -2977
  91. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/event-api-guide.md +321 -321
  92. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-payload-json-order-cancel-event.md +959 -959
  93. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-payload-xml-order-cancel-event.md +1170 -1170
  94. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-csv-product-event.md +2312 -2312
  95. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-json-product-event.md +2999 -2999
  96. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-parquet-product-event.md +2836 -2836
  97. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-xml-product-event.md +2395 -2395
  98. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-csv-product-event.md +2295 -2295
  99. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-json-product-event.md +2602 -2602
  100. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-parquet-product-event.md +2589 -2589
  101. package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-xml-product-event.md +3578 -3578
  102. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/graphql-mutations-guide.md +93 -93
  103. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-payload-json-order-update-graphql.md +1260 -1260
  104. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-payload-xml-order-update-graphql.md +1472 -1472
  105. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-control-graphql.md +2417 -2417
  106. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-location-graphql.md +2811 -2811
  107. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-price-graphql.md +2619 -2619
  108. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-json-location-graphql.md +2807 -2807
  109. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-xml-location-graphql.md +2373 -2373
  110. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-csv-control-graphql.md +2740 -2740
  111. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-csv-location-graphql.md +2760 -2760
  112. package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-json-location-graphql.md +1710 -1710
  113. package/docs/01-TEMPLATES/versori/workflows/ingestion/ingestion-workflows-guide.md +136 -136
  114. package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/rubix-webhooks-guide.md +520 -520
  115. package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-fulfilment-to-sftp-xml-inline.md +1418 -1418
  116. package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-fulfilment-to-sftp-xml-universal-mapper.md +1785 -1785
  117. package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-order-attribute-update.md +824 -824
  118. package/docs/01-TEMPLATES/versori/workflows/workflows-overview-guide.md +646 -646
  119. package/docs/02-CORE-GUIDES/advanced-services/advanced-services-batch-archival.md +724 -724
  120. package/docs/02-CORE-GUIDES/advanced-services/advanced-services-job-tracker.md +627 -627
  121. package/docs/02-CORE-GUIDES/advanced-services/advanced-services-partial-batch-recovery.md +561 -561
  122. package/docs/02-CORE-GUIDES/advanced-services/advanced-services-quick-reference.md +367 -367
  123. package/docs/02-CORE-GUIDES/advanced-services/advanced-services-readme.md +407 -407
  124. package/docs/02-CORE-GUIDES/advanced-services/readme.md +49 -49
  125. package/docs/02-CORE-GUIDES/api-reference/api-reference-quick-reference.md +548 -548
  126. package/docs/02-CORE-GUIDES/api-reference/event-api-input-output-reference.md +702 -1171
  127. package/docs/02-CORE-GUIDES/api-reference/examples/client-initialization.ts +286 -286
  128. package/docs/02-CORE-GUIDES/api-reference/graphql-error-classification.md +337 -337
  129. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-01-client-api.md +399 -520
  130. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-03-authentication.md +199 -199
  131. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-04-graphql-mapping.md +925 -925
  132. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-05-services.md +1198 -1198
  133. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-06-data-sources.md +1083 -1083
  134. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-07-parsers.md +1097 -1097
  135. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-08-pagination.md +513 -513
  136. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-08-types.md +545 -597
  137. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-09-error-handling.md +527 -527
  138. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-09-webhook-validation.md +514 -514
  139. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-10-extraction.md +557 -557
  140. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-10-utilities.md +412 -412
  141. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-11-cli-tools.md +423 -423
  142. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-11-error-handling.md +716 -716
  143. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-analyze-source-structure.md +518 -518
  144. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-partial-responses.md +212 -212
  145. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-testing.md +300 -300
  146. package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-13-resolver-builder.md +322 -322
  147. package/docs/02-CORE-GUIDES/api-reference/readme.md +279 -279
  148. package/docs/02-CORE-GUIDES/auto-pagination/auto-pagination-quick-reference.md +351 -351
  149. package/docs/02-CORE-GUIDES/auto-pagination/auto-pagination-readme.md +277 -277
  150. package/docs/02-CORE-GUIDES/auto-pagination/examples/auto-pagination-readme.md +178 -178
  151. package/docs/02-CORE-GUIDES/auto-pagination/examples/common-patterns.ts +351 -351
  152. package/docs/02-CORE-GUIDES/auto-pagination/examples/paginate-products.ts +384 -384
  153. package/docs/02-CORE-GUIDES/auto-pagination/examples/paginate-virtual-positions.ts +308 -308
  154. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-01-foundations.md +470 -470
  155. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-02-quick-start.md +713 -713
  156. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-03-configuration.md +754 -754
  157. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-04-advanced-patterns.md +732 -732
  158. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-05-sdk-integration.md +847 -847
  159. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-06-troubleshooting.md +359 -359
  160. package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-07-api-reference.md +462 -462
  161. package/docs/02-CORE-GUIDES/auto-pagination/readme.md +54 -54
  162. package/docs/02-CORE-GUIDES/data-sources/data-sources-file-operations-error-handling.md +1487 -1487
  163. package/docs/02-CORE-GUIDES/data-sources/data-sources-quick-reference.md +836 -836
  164. package/docs/02-CORE-GUIDES/data-sources/data-sources-readme.md +276 -276
  165. package/docs/02-CORE-GUIDES/data-sources/data-sources-sftp-credential-access-security.md +553 -553
  166. package/docs/02-CORE-GUIDES/data-sources/examples/common-patterns.ts +409 -409
  167. package/docs/02-CORE-GUIDES/data-sources/examples/data-sources-readme.md +178 -178
  168. package/docs/02-CORE-GUIDES/data-sources/examples/s3-operations.ts +308 -308
  169. package/docs/02-CORE-GUIDES/data-sources/examples/sftp-operations.ts +371 -371
  170. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-01-foundations.md +735 -735
  171. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-02-s3-operations.md +1302 -1302
  172. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-03-sftp-operations.md +1379 -1379
  173. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-04-file-patterns.md +941 -941
  174. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-05-advanced-topics.md +813 -813
  175. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-06-integration-patterns.md +486 -486
  176. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-07-troubleshooting.md +387 -387
  177. package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-08-api-reference.md +417 -417
  178. package/docs/02-CORE-GUIDES/data-sources/readme.md +77 -77
  179. package/docs/02-CORE-GUIDES/error-handling-guide.md +936 -936
  180. package/docs/02-CORE-GUIDES/extraction/examples/02-core-guides-extraction-readme.md +116 -116
  181. package/docs/02-CORE-GUIDES/extraction/examples/common-patterns.ts +428 -428
  182. package/docs/02-CORE-GUIDES/extraction/examples/extract-inventory-basic.ts +187 -187
  183. package/docs/02-CORE-GUIDES/extraction/extraction-quick-reference.md +596 -596
  184. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-01-foundations.md +514 -514
  185. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-02-basic-extraction.md +823 -823
  186. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-03-parquet-processing.md +507 -507
  187. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-04-data-enrichment.md +546 -546
  188. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-05-transformation.md +494 -494
  189. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-06-export-formats.md +458 -458
  190. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-06-performance.md +138 -138
  191. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-07-api-reference.md +148 -148
  192. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-07-optimization.md +692 -692
  193. package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-08-extraction-orchestrator.md +1008 -1008
  194. package/docs/02-CORE-GUIDES/extraction/readme.md +151 -151
  195. package/docs/02-CORE-GUIDES/ingestion/examples/_simple-kv-store.ts +40 -40
  196. package/docs/02-CORE-GUIDES/ingestion/examples/error-recovery.ts +728 -728
  197. package/docs/02-CORE-GUIDES/ingestion/examples/event-driven.ts +501 -501
  198. package/docs/02-CORE-GUIDES/ingestion/examples/local-file-ingestion.ts +88 -88
  199. package/docs/02-CORE-GUIDES/ingestion/examples/parquet-ingestion.ts +117 -117
  200. package/docs/02-CORE-GUIDES/ingestion/examples/performance-optimized.ts +647 -647
  201. package/docs/02-CORE-GUIDES/ingestion/examples/s3-csv-ingestion.ts +169 -169
  202. package/docs/02-CORE-GUIDES/ingestion/examples/sftp-csv-ingestion.ts +134 -134
  203. package/docs/02-CORE-GUIDES/ingestion/ingestion-quick-reference.md +546 -546
  204. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-01-introduction.md +626 -626
  205. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-02-quick-start.md +658 -658
  206. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-03-data-sources.md +1052 -1052
  207. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-04-field-mapping.md +763 -763
  208. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-05-advanced-parsers.md +676 -676
  209. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-06-batch-api.md +1295 -1295
  210. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-api-reference.md +138 -138
  211. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-state-management.md +1037 -1037
  212. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-08-performance-optimization.md +1349 -1349
  213. package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-09-best-practices.md +1893 -1893
  214. package/docs/02-CORE-GUIDES/ingestion/readme.md +160 -160
  215. package/docs/02-CORE-GUIDES/logging-guide.md +585 -585
  216. package/docs/02-CORE-GUIDES/mapping/error-handling-patterns.md +401 -401
  217. package/docs/02-CORE-GUIDES/mapping/examples/02-core-guides-mapping-readme.md +128 -128
  218. package/docs/02-CORE-GUIDES/mapping/examples/common-patterns.ts +273 -273
  219. package/docs/02-CORE-GUIDES/mapping/examples/csv-location-ingestion.json +36 -36
  220. package/docs/02-CORE-GUIDES/mapping/examples/csv-mapping.ts +242 -242
  221. package/docs/02-CORE-GUIDES/mapping/examples/graphql-to-parquet-extraction.json +36 -36
  222. package/docs/02-CORE-GUIDES/mapping/examples/json-mapping.ts +213 -213
  223. package/docs/02-CORE-GUIDES/mapping/examples/json-product-to-mutation.json +48 -48
  224. package/docs/02-CORE-GUIDES/mapping/examples/xml-mapping.ts +291 -291
  225. package/docs/02-CORE-GUIDES/mapping/examples/xml-order-to-mutation.json +45 -45
  226. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/graphql-mutation-mapping-quick-reference.md +463 -463
  227. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/graphql-mutation-mapping-readme.md +227 -227
  228. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-01-introduction.md +222 -222
  229. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-02-quick-start.md +351 -351
  230. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-03-schema-validation.md +569 -569
  231. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-04-mapping-patterns.md +471 -471
  232. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-05-configuration-reference.md +611 -611
  233. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-06-advanced-xpath.md +148 -148
  234. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-06-path-syntax.md +464 -464
  235. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-07-api-reference.md +94 -94
  236. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-07-array-handling.md +307 -307
  237. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-08-custom-resolvers.md +544 -544
  238. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-09-advanced-patterns.md +427 -427
  239. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-10-hooks-and-variables.md +336 -336
  240. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-11-error-handling.md +488 -488
  241. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-12-arguments-vs-nodes.md +383 -383
  242. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-13-best-practices.md +477 -477
  243. package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/readme.md +62 -62
  244. package/docs/02-CORE-GUIDES/mapping/mapping-format-decision-tree.md +480 -480
  245. package/docs/02-CORE-GUIDES/mapping/mapping-graphql-alias-batching-guide.md +820 -820
  246. package/docs/02-CORE-GUIDES/mapping/mapping-javascript-objects.md +2369 -2369
  247. package/docs/02-CORE-GUIDES/mapping/mapping-mapper-comparison-guide.md +682 -682
  248. package/docs/02-CORE-GUIDES/mapping/modules/02-core-guides-mapping-07-api-reference.md +1327 -1327
  249. package/docs/02-CORE-GUIDES/mapping/modules/02-core-guides-mapping-08-error-handling.md +1142 -1142
  250. package/docs/02-CORE-GUIDES/mapping/modules/mapping-04-use-cases.md +891 -891
  251. package/docs/02-CORE-GUIDES/mapping/modules/mapping-06-helpers-resolvers.md +1126 -1126
  252. package/docs/02-CORE-GUIDES/mapping/modules/mapping-06-sdk-resolvers.md +199 -199
  253. package/docs/02-CORE-GUIDES/mapping/modules/mapping-07-api-reference.md +1319 -1319
  254. package/docs/02-CORE-GUIDES/mapping/readme.md +178 -178
  255. package/docs/02-CORE-GUIDES/mapping/resolver-registration.md +410 -410
  256. package/docs/02-CORE-GUIDES/mapping/resolvers/examples/common-patterns.ts +226 -226
  257. package/docs/02-CORE-GUIDES/mapping/resolvers/examples/custom-resolvers.ts +227 -227
  258. package/docs/02-CORE-GUIDES/mapping/resolvers/examples/sdk-resolvers-usage.ts +203 -203
  259. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-readme.md +274 -274
  260. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-api-reference.md +679 -679
  261. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-cookbook.md +826 -826
  262. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-guide.md +1330 -1330
  263. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-helpers-reference.md +1437 -1437
  264. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-parameters-reference.md +553 -553
  265. package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-troubleshooting.md +854 -854
  266. package/docs/02-CORE-GUIDES/mapping/resolvers/readme.md +75 -75
  267. package/docs/02-CORE-GUIDES/parsers/examples/02-core-guides-parsers-readme.md +161 -161
  268. package/docs/02-CORE-GUIDES/parsers/examples/csv-parser-examples.ts +110 -110
  269. package/docs/02-CORE-GUIDES/parsers/examples/json-parser-examples.ts +33 -33
  270. package/docs/02-CORE-GUIDES/parsers/examples/parquet-parser-examples.ts +47 -47
  271. package/docs/02-CORE-GUIDES/parsers/examples/xml-parser-examples.ts +38 -38
  272. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-01-foundations.md +355 -355
  273. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-02-csv-parser.md +772 -772
  274. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-03-json-parser.md +789 -789
  275. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-04-xml-parser.md +857 -857
  276. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-05-parquet-parser.md +603 -603
  277. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-06-integration-patterns.md +702 -702
  278. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-06-streaming.md +121 -121
  279. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-07-api-reference.md +89 -89
  280. package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-07-troubleshooting.md +727 -727
  281. package/docs/02-CORE-GUIDES/parsers/parsers-quick-reference.md +482 -482
  282. package/docs/02-CORE-GUIDES/parsers/parsers-readme.md +258 -258
  283. package/docs/02-CORE-GUIDES/parsers/readme.md +65 -65
  284. package/docs/02-CORE-GUIDES/readme.md +194 -194
  285. package/docs/02-CORE-GUIDES/webhook-validation/examples/basic-validation.ts +108 -108
  286. package/docs/02-CORE-GUIDES/webhook-validation/examples/common-patterns.ts +316 -316
  287. package/docs/02-CORE-GUIDES/webhook-validation/examples/webhook-validation-readme.md +61 -61
  288. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-01-foundations.md +440 -440
  289. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-02-quick-start.md +525 -525
  290. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-03-versori-integration.md +741 -741
  291. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-04-platform-integration.md +629 -629
  292. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-05-configuration.md +535 -535
  293. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-06-error-handling.md +611 -611
  294. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-06-troubleshooting.md +124 -124
  295. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-07-api-reference.md +511 -511
  296. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-08-rubix-webhooks.md +590 -590
  297. package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-09-rubix-event-vs-http-call.md +432 -432
  298. package/docs/02-CORE-GUIDES/webhook-validation/readme.md +239 -239
  299. package/docs/02-CORE-GUIDES/webhook-validation/webhook-validation-quick-reference.md +392 -392
  300. package/docs/03-PATTERN-GUIDES/connector-scenarios/connector-scenarios-quick-reference.md +498 -498
  301. package/docs/03-PATTERN-GUIDES/connector-scenarios/connector-scenarios-readme.md +313 -313
  302. package/docs/03-PATTERN-GUIDES/connector-scenarios/examples/common-patterns.ts +612 -612
  303. package/docs/03-PATTERN-GUIDES/connector-scenarios/examples/connector-scenarios-readme.md +253 -253
  304. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-01-foundations.md +452 -452
  305. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-02-simple-scenarios.md +681 -681
  306. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-03-intermediate-scenarios.md +637 -637
  307. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-04-advanced-scenarios.md +650 -650
  308. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-05-bidirectional-sync.md +233 -233
  309. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-06-production-patterns.md +442 -442
  310. package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-07-reference.md +445 -445
  311. package/docs/03-PATTERN-GUIDES/connector-scenarios/readme.md +31 -31
  312. package/docs/03-PATTERN-GUIDES/enterprise-integration-patterns.md +1528 -1528
  313. package/docs/03-PATTERN-GUIDES/error-handling/comprehensive-error-handling-guide.md +1437 -1437
  314. package/docs/03-PATTERN-GUIDES/error-handling/error-handling-quick-reference.md +390 -390
  315. package/docs/03-PATTERN-GUIDES/error-handling/examples/common-patterns.ts +438 -438
  316. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-01-foundations.md +362 -362
  317. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-02-error-types.md +850 -850
  318. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-03-utf8-handling.md +456 -456
  319. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-04-error-scenarios.md +658 -658
  320. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-05-calling-patterns.md +671 -671
  321. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-06-retry-strategies.md +1034 -1034
  322. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-07-monitoring.md +653 -653
  323. package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-08-api-reference.md +847 -847
  324. package/docs/03-PATTERN-GUIDES/error-handling/readme.md +36 -36
  325. package/docs/03-PATTERN-GUIDES/examples/__tests__/readme.md +40 -40
  326. package/docs/03-PATTERN-GUIDES/examples/__tests__/resolver-examples.test.js +282 -282
  327. package/docs/03-PATTERN-GUIDES/examples/test-data/03-pattern-guides-readme.md +110 -110
  328. package/docs/03-PATTERN-GUIDES/examples/test-data/canonical-inventory.json +123 -123
  329. package/docs/03-PATTERN-GUIDES/examples/test-data/canonical-order.json +171 -171
  330. package/docs/03-PATTERN-GUIDES/examples/test-data/readme.md +28 -28
  331. package/docs/03-PATTERN-GUIDES/extraction/extraction-readme.md +15 -15
  332. package/docs/03-PATTERN-GUIDES/extraction/readme.md +25 -25
  333. package/docs/03-PATTERN-GUIDES/file-operations/examples/common-patterns.ts +407 -407
  334. package/docs/03-PATTERN-GUIDES/file-operations/examples/file-operations-readme.md +142 -142
  335. package/docs/03-PATTERN-GUIDES/file-operations/file-operations-quick-reference.md +462 -462
  336. package/docs/03-PATTERN-GUIDES/file-operations/file-operations-readme.md +379 -379
  337. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-01-foundations.md +430 -430
  338. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-02-quick-start.md +484 -484
  339. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-03-s3-operations.md +507 -507
  340. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-04-sftp-operations.md +963 -963
  341. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-05-streaming-performance.md +503 -503
  342. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-06-archive-patterns.md +386 -386
  343. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-06-error-handling.md +117 -117
  344. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-07-api-reference.md +78 -78
  345. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-07-testing-troubleshooting.md +567 -567
  346. package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-08-api-reference.md +1055 -1055
  347. package/docs/03-PATTERN-GUIDES/file-operations/readme.md +32 -32
  348. package/docs/03-PATTERN-GUIDES/ingestion/ingestion-readme.md +15 -15
  349. package/docs/03-PATTERN-GUIDES/ingestion/readme.md +25 -25
  350. package/docs/03-PATTERN-GUIDES/integration-patterns/examples/batch-processing.ts +130 -130
  351. package/docs/03-PATTERN-GUIDES/integration-patterns/examples/common-patterns.ts +360 -360
  352. package/docs/03-PATTERN-GUIDES/integration-patterns/examples/delta-sync.ts +130 -130
  353. package/docs/03-PATTERN-GUIDES/integration-patterns/examples/integration-patterns-readme.md +100 -100
  354. package/docs/03-PATTERN-GUIDES/integration-patterns/examples/real-time-webhook.ts +398 -398
  355. package/docs/03-PATTERN-GUIDES/integration-patterns/integration-patterns-quick-reference.md +962 -962
  356. package/docs/03-PATTERN-GUIDES/integration-patterns/integration-patterns-readme.md +134 -134
  357. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-01-real-time-processing.md +991 -991
  358. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-02-batch-processing.md +1547 -1547
  359. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-03-delta-sync.md +1108 -1108
  360. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-04-webhook-patterns.md +1181 -1181
  361. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-05-error-handling.md +1061 -1061
  362. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-06-advanced-integration-services.md +1547 -1547
  363. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-06-performance.md +109 -109
  364. package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-07-api-reference.md +34 -34
  365. package/docs/03-PATTERN-GUIDES/integration-patterns/readme.md +30 -30
  366. package/docs/03-PATTERN-GUIDES/logging-minimal-mode.md +128 -128
  367. package/docs/03-PATTERN-GUIDES/multiple-connections/examples/common-patterns.ts +380 -380
  368. package/docs/03-PATTERN-GUIDES/multiple-connections/examples/multiple-connections-readme.md +139 -139
  369. package/docs/03-PATTERN-GUIDES/multiple-connections/examples/parallel-root-connections.ts +149 -149
  370. package/docs/03-PATTERN-GUIDES/multiple-connections/examples/real-world-scenarios.ts +405 -405
  371. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-01-foundations.md +378 -378
  372. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-02-quick-start.md +566 -566
  373. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-03-targeting-connections.md +659 -659
  374. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-04-parallel-queries.md +656 -656
  375. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-05-best-practices.md +624 -624
  376. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-06-api-reference.md +824 -824
  377. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-06-versori.md +119 -119
  378. package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-07-api-reference.md +87 -87
  379. package/docs/03-PATTERN-GUIDES/multiple-connections/multiple-connections-quick-reference.md +353 -353
  380. package/docs/03-PATTERN-GUIDES/multiple-connections/multiple-connections-readme.md +270 -270
  381. package/docs/03-PATTERN-GUIDES/multiple-connections/readme.md +30 -30
  382. package/docs/03-PATTERN-GUIDES/pagination/pagination-readme.md +14 -14
  383. package/docs/03-PATTERN-GUIDES/pagination/readme.md +24 -24
  384. package/docs/03-PATTERN-GUIDES/parquet/examples/common-patterns.ts +180 -180
  385. package/docs/03-PATTERN-GUIDES/parquet/examples/read-parquet.ts +48 -48
  386. package/docs/03-PATTERN-GUIDES/parquet/examples/write-parquet.ts +65 -65
  387. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-01-introduction.md +393 -393
  388. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-02-quick-start.md +572 -572
  389. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-03-reading-parquet.md +525 -525
  390. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-04-writing-parquet.md +554 -554
  391. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-05-graphql-extraction.md +405 -405
  392. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-06-performance.md +104 -104
  393. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-06-s3-integration.md +511 -511
  394. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-07-api-reference.md +90 -90
  395. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-07-performance-optimization.md +525 -525
  396. package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-08-best-practices.md +712 -712
  397. package/docs/03-PATTERN-GUIDES/parquet/parquet-quick-reference.md +683 -683
  398. package/docs/03-PATTERN-GUIDES/parquet/parquet-readme.md +248 -248
  399. package/docs/03-PATTERN-GUIDES/parquet/readme.md +32 -32
  400. package/docs/03-PATTERN-GUIDES/parsers/parsers-readme.md +12 -12
  401. package/docs/03-PATTERN-GUIDES/parsers/readme.md +24 -24
  402. package/docs/03-PATTERN-GUIDES/readme.md +159 -159
  403. package/docs/03-PATTERN-GUIDES/webhooks/readme.md +24 -24
  404. package/docs/03-PATTERN-GUIDES/webhooks/webhooks-readme.md +8 -8
  405. package/docs/04-REFERENCE/architecture/architecture-01-overview.md +427 -427
  406. package/docs/04-REFERENCE/architecture/architecture-02-client-architecture.md +424 -424
  407. package/docs/04-REFERENCE/architecture/architecture-03-data-flow.md +690 -690
  408. package/docs/04-REFERENCE/architecture/architecture-04-service-layer.md +834 -834
  409. package/docs/04-REFERENCE/architecture/architecture-05-integration-architecture.md +655 -655
  410. package/docs/04-REFERENCE/architecture/architecture-06-state-management.md +653 -653
  411. package/docs/04-REFERENCE/architecture/architecture-adding-new-data-sources.md +686 -686
  412. package/docs/04-REFERENCE/architecture/readme.md +279 -279
  413. package/docs/04-REFERENCE/platforms/deno/readme.md +117 -117
  414. package/docs/04-REFERENCE/platforms/nodejs/readme.md +146 -146
  415. package/docs/04-REFERENCE/platforms/readme.md +135 -135
  416. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-01-introduction.md +398 -398
  417. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-02-quick-start.md +560 -560
  418. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-03-authentication.md +757 -757
  419. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-04-workflows.md +2476 -2476
  420. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-05-connections.md +1167 -1167
  421. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-06-kv-storage.md +990 -990
  422. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-06-state-management.md +121 -121
  423. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-07-api-reference.md +68 -68
  424. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-07-deployment.md +731 -731
  425. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-08-best-practices.md +1111 -1111
  426. package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-09-signature-reference.md +766 -766
  427. package/docs/04-REFERENCE/platforms/versori/platforms-versori-readme.md +299 -299
  428. package/docs/04-REFERENCE/platforms/versori/platforms-versori-s3-sftp-configuration-guide.md +1425 -1425
  429. package/docs/04-REFERENCE/platforms/versori/platforms-versori-webhook-api-key-security.md +816 -816
  430. package/docs/04-REFERENCE/platforms/versori/platforms-versori-webhook-connection-security.md +681 -681
  431. package/docs/04-REFERENCE/platforms/versori/platforms-versori-workflow-task-types.md +708 -708
  432. package/docs/04-REFERENCE/platforms/versori/readme.md +108 -108
  433. package/docs/04-REFERENCE/readme.md +148 -148
  434. package/docs/04-REFERENCE/resolver-signature/examples/advanced-resolvers.ts +482 -482
  435. package/docs/04-REFERENCE/resolver-signature/examples/async-resolvers.ts +496 -496
  436. package/docs/04-REFERENCE/resolver-signature/examples/basic-resolvers.ts +343 -343
  437. package/docs/04-REFERENCE/resolver-signature/examples/resolver-signature-readme.md +188 -188
  438. package/docs/04-REFERENCE/resolver-signature/examples/testing-resolvers.ts +463 -463
  439. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-01-foundations.md +286 -286
  440. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-02-parameter-reference.md +643 -643
  441. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-03-basic-examples.md +521 -521
  442. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-04-advanced-patterns.md +739 -739
  443. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-05-sdk-resolvers.md +531 -531
  444. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-06-migration-guide.md +650 -650
  445. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-06-testing.md +125 -125
  446. package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-07-api-reference.md +794 -794
  447. package/docs/04-REFERENCE/resolver-signature/readme.md +64 -64
  448. package/docs/04-REFERENCE/resolver-signature/resolver-signature-quick-reference.md +270 -270
  449. package/docs/04-REFERENCE/resolver-signature/resolver-signature-readme.md +351 -351
  450. package/docs/04-REFERENCE/schema/fluent-commerce-schema.json +764 -764
  451. package/docs/04-REFERENCE/schema/readme.md +141 -141
  452. package/docs/04-REFERENCE/testing/examples/04-reference-testing-readme.md +158 -158
  453. package/docs/04-REFERENCE/testing/examples/fluent-testing.ts +62 -62
  454. package/docs/04-REFERENCE/testing/examples/health-check.ts +155 -155
  455. package/docs/04-REFERENCE/testing/examples/integration-test.ts +119 -119
  456. package/docs/04-REFERENCE/testing/examples/performance-test.ts +183 -183
  457. package/docs/04-REFERENCE/testing/examples/s3-testing.ts +127 -127
  458. package/docs/04-REFERENCE/testing/modules/04-reference-testing-01-foundations.md +267 -267
  459. package/docs/04-REFERENCE/testing/modules/04-reference-testing-02-s3-testing.md +599 -599
  460. package/docs/04-REFERENCE/testing/modules/04-reference-testing-03-fluent-testing.md +589 -589
  461. package/docs/04-REFERENCE/testing/modules/04-reference-testing-04-integration-testing.md +699 -699
  462. package/docs/04-REFERENCE/testing/modules/04-reference-testing-05-debugging.md +478 -478
  463. package/docs/04-REFERENCE/testing/modules/04-reference-testing-06-cicd-integration.md +463 -463
  464. package/docs/04-REFERENCE/testing/modules/04-reference-testing-06-preflight-validation.md +131 -131
  465. package/docs/04-REFERENCE/testing/modules/04-reference-testing-07-best-practices.md +499 -499
  466. package/docs/04-REFERENCE/testing/modules/04-reference-testing-07-coverage-ci.md +165 -165
  467. package/docs/04-REFERENCE/testing/modules/04-reference-testing-08-api-reference.md +634 -634
  468. package/docs/04-REFERENCE/testing/readme.md +86 -86
  469. package/docs/04-REFERENCE/testing/testing-quick-reference.md +667 -667
  470. package/docs/04-REFERENCE/testing/testing-readme.md +286 -286
  471. package/docs/04-REFERENCE/troubleshooting/readme.md +144 -144
  472. package/docs/04-REFERENCE/troubleshooting/troubleshooting-deno-sftp-compatibility.md +392 -392
  473. package/docs/template-loading-matrix.md +242 -242
  474. package/package.json +5 -3
  475. package/docs/02-CORE-GUIDES/api-reference/cli-profile-integration.md +0 -377
@@ -1,1430 +1,1430 @@
- # Pattern: Large File Processing & Chunking
-
- **FC Connect SDK Use Case Guide**
-
- > **SDK**: [@fluentcommerce/fc-connect-sdk](https://www.npmjs.com/package/@fluentcommerce/fc-connect-sdk)
- > **Version**: Use latest - `npm install @fluentcommerce/fc-connect-sdk@latest`
-
- **Context**: Enterprise-scale file ingestion with streaming, splitting, and parallel processing
-
- **Type**: Advanced Pattern
-
- **Complexity**: High
-
- **Volume**: 500MB-5GB files, 1M-10M records
-
- **Latency**: Batch processing (< 30-60 min for 10M records)
-
- **Pattern**: Streaming + chunking + parallel Batch API
-
- ## When to Use This Pattern
-
- Use this pattern when dealing with:
-
- - **Large CSV files** (>500MB, >1M records)
- - **Memory-constrained environments** (Lambda, containers with limited RAM)
- - **Time-sensitive ingestion** (need parallel processing for speed)
- - **Reliability requirements** (checkpoint/resume on failure)
- - **Progress tracking** (real-time status updates)
-
- **Volume Guidance:**
-
- - **Small** (<1K records): Use basic ingestion pattern
- - **Medium** (1K-100K records): Use streaming pattern (Pattern 1)
- - **Large** (100K-1M records): Use file chunking pattern (Pattern 2)
- - **Huge** (1M-10M records): Use parallel processing pattern (Pattern 3)
- - **Enterprise** (10M+ records): Use distributed processing pattern (Pattern 4)
-
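The volume tiers above can be encoded as a tiny dispatcher. The sketch below is illustrative only — the function name, the `IngestionPattern` type, and the exact thresholds are ours, not part of the SDK:

```typescript
// Illustrative helper (not an SDK API): map an estimated record count to the
// pattern tiers from the Volume Guidance list above.
type IngestionPattern = 'basic' | 'streaming' | 'chunking' | 'parallel' | 'distributed';

function selectIngestionPattern(estimatedRecords: number): IngestionPattern {
  if (estimatedRecords < 1_000) return 'basic';          // Small
  if (estimatedRecords <= 100_000) return 'streaming';   // Medium -> Pattern 1
  if (estimatedRecords <= 1_000_000) return 'chunking';  // Large -> Pattern 2
  if (estimatedRecords <= 10_000_000) return 'parallel'; // Huge -> Pattern 3
  return 'distributed';                                  // Enterprise -> Pattern 4
}
```

A guard like this keeps the tier boundaries in one place if you route files to different workflows by size.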
- ## Problem Statement
-
- ### Why Splitting is Needed
-
- **Memory Constraints:**
-
- ```typescript
- // ❌ WRONG - Loads entire 2GB file into memory
- const csvContent = await fs.readFile('huge-inventory.csv', 'utf-8');
- const records = await csvParser.parse(csvContent); // 💥 Out of memory
- ```
-
- **Impact:**
-
- - Lambda 512MB: Crashes on 500MB+ files
- - Container 1GB: Struggles with 1GB+ files
- - Node.js default heap (4GB): Fails on 5GB+ files
-
- **Time Constraints:**
-
- ```typescript
- // ❌ WRONG - Sequential processing takes 90+ minutes
- for (const record of records) {
-   await processRecord(record); // Too slow for 10M records
- }
- ```
-
- **Reliability Requirements:**
-
- ```typescript
- // ❌ WRONG - Network failure loses all progress
- await processAllRecords(records); // If fails at record 5M, restart from 0
- ```
-
- ### Solution Overview
-
- This guide demonstrates 4 progressive patterns:
-
- 1. **Basic Streaming** (~200 lines) - Process records as they arrive, memory-efficient
- 2. **File Chunking** (~300 lines) - Split large files into manageable chunks
- 3. **Parallel Processing** (~400 lines) - Process chunks concurrently with progress tracking
- 4. **Distributed Processing** (~300 lines) - Use Versori scheduled workflows for enterprise scale
-
- ## SDK Methods Used
-
- ```typescript
- import {
-   createClient,        // Client factory (auto-detects context)
-   CSVParserService,    // Streaming CSV parser
-   S3DataSource,        // S3 file operations
-   UniversalMapper,     // Field mapping
-   StateService,        // Progress tracking
-   VersoriKVAdapter,    // Versori state management
-   createConsoleLogger, // Structured logging
-   toStructuredLogger
- } from '@fluentcommerce/fc-connect-sdk';
- ```
-
- ---
-
- ## Pattern 1: Basic Streaming (Memory-Efficient)
-
- **Best for:** 100K-1M records, single-threaded processing, memory-constrained environments
-
- **Memory Usage:**
-
- - ❌ Without streaming: 2GB file = 2GB+ RAM (file + parsed objects)
- - ✅ With streaming: 2GB file = ~50MB RAM (processes records incrementally)
-
- ### Implementation
-
- ```typescript
- import {
-   createClient,
-   CSVParserService,
-   S3DataSource,
-   UniversalMapper,
-   createConsoleLogger,
-   toStructuredLogger
- } from '@fluentcommerce/fc-connect-sdk';
-
- const logger = createConsoleLogger();
-
- async function streamingIngestion(ctx: any) {
-   logger.info('Starting streaming ingestion');
-
-   // Create client (auto-detects Versori context)
-   const client = await createClient(ctx);
-
-   // Initialize S3 data source
-   const s3 = new S3DataSource(
-     {
-       type: 'S3_CSV',
-       connectionId: 'my-s3',
-       name: 'Inventory Files S3',
-       s3Config: {
-         bucket: 'inventory-files',
-         region: 'us-east-1',
-         accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
-         secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
-       },
-     },
-     logger
-   );
-
-   // Define field mapping
-   const mapper = new UniversalMapper({
-     fields: {
-       skuRef: { source: 'sku', required: true },
-       locationRef: { source: 'location_code', required: true },
-       qty: { source: 'quantity', resolver: 'sdk.parseInt' },
-       expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
-     },
-   });
-
-   // Create CSV parser with streaming enabled
-   const csvParser = new CSVParserService();
-
-   // Download file content (the parser below streams through it record-by-record)
-   logger.info('Downloading file from S3', {
-     key: 'inventory/large-file.csv',
-   });
-
-   const fileContent = (await s3.downloadFile('inventory/large-file.csv', {
-     encoding: 'utf8',
-   })) as string;
-
-   // Create job for batch ingestion
-   const job = await client.createJob({
-     name: 'streaming-inventory-ingestion',
-     retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
-   });
-
-   logger.info('Job created', { jobId: job.id });
-
-   // Statistics tracking
-   let recordsProcessed = 0;
-   let batchCount = 0;
-   let errors = 0;
-   const BATCH_SIZE = 1000;
-   let currentBatch: any[] = [];
-
-   // Stream records with batching (memory-efficient)
-   // Records are parsed incrementally, not all at once
-   for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
-     try {
-       // Map record
-       const mapped = await mapper.map(record);
-
-       if (mapped.success && mapped.data) {
-         currentBatch.push(mapped.data);
-         recordsProcessed++;
-
-         // Send batch when full
-         if (currentBatch.length >= BATCH_SIZE) {
-           await client.sendBatch(job.id, {
-             entities: currentBatch,
-           });
-
-           batchCount++;
-
-           logger.info('Batch sent', {
-             batchNumber: batchCount,
-             recordsProcessed,
-             currentBatchSize: currentBatch.length,
-           });
-
-           currentBatch = []; // Clear batch (frees memory)
-         }
-       } else {
-         errors++;
-         logger.warn('Record mapping failed', {
-           record,
-           errors: mapped.errors,
-         });
-       }
-     } catch (error) {
-       errors++;
-       logger.error('Record processing failed', error as Error, { record });
-     }
-
-     // Progress logging every 10K records
-     if (recordsProcessed % 10000 === 0) {
-       logger.info('Progress update', {
-         recordsProcessed,
-         batchesSent: batchCount,
-         errors,
-         memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024 + ' MB',
-       });
-     }
-   }
-
-   // Send remaining records
-   if (currentBatch.length > 0) {
-     await client.sendBatch(job.id, {
-       entities: currentBatch,
-     });
-     batchCount++;
-   }
-
-   logger.info('Streaming ingestion complete', {
-     totalRecords: recordsProcessed,
-     batchesSent: batchCount,
-     errors,
-     jobId: job.id,
-   });
-
-   return {
-     success: true,
-     jobId: job.id,
-     recordsProcessed,
-     batchesSent: batchCount,
-     errors,
-   };
- }
- ```
-
- **Memory Profile:**
-
- ```
- File Size: 2GB (5M records)
- RAM Usage: ~50MB peak (1000 record batches)
- Processing Time: ~45 minutes (sequential)
- ```
-
- ---
-
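The batch-and-flush loop at the heart of Pattern 1 can be factored into a small standalone helper. This is an illustrative sketch of the same accumulate-then-flush technique, not an SDK API — `processInBatches` and its signature are ours:

```typescript
// Illustrative sketch (not an SDK API): accumulate items and flush them in
// fixed-size batches, mirroring the BATCH_SIZE logic in Pattern 1.
// Returns the number of batches flushed.
async function processInBatches<T>(
  items: AsyncIterable<T> | Iterable<T>,
  batchSize: number,
  flush: (batch: T[]) => Promise<void>
): Promise<number> {
  let batchesSent = 0;
  let current: T[] = [];
  for await (const item of items) {
    current.push(item);
    if (current.length >= batchSize) {
      await flush(current);
      current = []; // drop references so the GC can reclaim the batch
      batchesSent++;
    }
  }
  if (current.length > 0) {
    await flush(current); // flush the final partial batch
    batchesSent++;
  }
  return batchesSent;
}
```

In Pattern 1 terms, `items` would be the `parseStreaming(...)` iterator and `flush` would wrap `client.sendBatch(job.id, { entities: batch })`.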
272
- ## Pattern 2: File Chunking (Split & Track)
273
-
274
- **Best for:** 1M-5M records, need checkpoint/resume, want progress visibility
275
-
276
- **Strategy:**
277
-
278
- 1. Split large file into 100K record chunks
279
- 2. Write chunks to temp S3 locations
280
- 3. Track chunk metadata in VersoriKV
281
- 4. Process chunks sequentially (can resume on failure)
282
-
283
- ### Implementation
284
-
285
- ```typescript
286
- import {
287
- createClient,
288
- CSVParserService,
289
- S3DataSource,
290
- UniversalMapper,
291
- StateService,
292
- VersoriKVAdapter,
293
- createConsoleLogger,
294
- toStructuredLogger
295
- } from '@fluentcommerce/fc-connect-sdk';
296
-
297
- const logger = createConsoleLogger();
298
-
299
- interface ChunkMetadata {
300
- chunkId: string;
301
- startRecord: number;
302
- endRecord: number;
303
- s3Key: string;
304
- recordCount: number;
305
- status: 'pending' | 'processing' | 'completed' | 'failed';
306
- processedAt?: string;
307
- error?: string;
308
- }
309
-
310
- async function chunkedIngestion(ctx: any) {
311
- logger.info('Starting chunked ingestion');
312
-
313
- // Initialize services
314
- const client = await createClient(ctx);
315
-
316
- const s3 = new S3DataSource(
317
- {
318
- type: 'S3_CSV',
319
- connectionId: 'my-s3-chunked',
320
- name: 'Inventory Files S3 Chunked',
321
- s3Config: {
322
- bucket: 'inventory-files',
323
- region: 'us-east-1',
324
- accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
325
- secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
326
- },
327
- },
328
- logger
329
- );
330
-
331
- // Initialize state management
332
- const kv = context.openKv();
333
- const kvAdapter = new VersoriKVAdapter(kv);
334
- const stateService = new StateService(logger);
335
-
336
- const SOURCE_FILE = 'inventory/huge-inventory.csv';
337
- const CHUNK_SIZE = 100000; // 100K records per chunk
338
- const workflowId = 'chunked-ingestion';
339
-
340
- // STEP 1: Check if chunking is already in progress
341
- const existingState = await stateService.getSyncState(kvAdapter, workflowId);
342
-
343
- if (existingState.isInitialized && existingState.lastSyncResult === 'partial') {
344
- logger.info('Resuming from previous run', {
345
- lastProcessedFile: existingState.lastProcessedFile,
346
- lastProcessedCount: existingState.lastProcessedCount,
347
- });
348
- }
349
-
350
- // STEP 2: Split file into chunks
351
- logger.info('Splitting file into chunks', {
352
- sourceFile: SOURCE_FILE,
353
- chunkSize: CHUNK_SIZE,
354
- });
355
-
356
- const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);
357
-
358
- logger.info('File split complete', {
359
- totalChunks: chunks.length,
360
- totalRecords: chunks.reduce((sum, c) => sum + c.recordCount, 0),
361
- });
362
-
363
- // STEP 3: Create job for ingestion
364
- const job = await client.createJob({
365
- name: `chunked-inventory-ingestion-${Date.now()}`,
366
- retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
367
- });
368
-
369
- logger.info('Job created', { jobId: job.id });
370
-
371
- // STEP 4: Process each chunk sequentially
372
- let successCount = 0;
373
- let failureCount = 0;
374
-
375
- for (const chunk of chunks) {
376
- try {
377
- // Skip if already processed
378
- const chunkState = await kvAdapter.get(['chunk', workflowId, chunk.chunkId, 'status']);
379
-
380
- if (chunkState?.value === 'completed') {
381
- logger.info('Chunk already processed, skipping', {
382
- chunkId: chunk.chunkId,
383
- });
384
- successCount++;
385
- continue;
386
- }
387
-
388
- // Mark chunk as processing
389
- await kvAdapter.set(['chunk', workflowId, chunk.chunkId, 'status'], 'processing');
390
-
391
- logger.info('Processing chunk', {
392
- chunkId: chunk.chunkId,
393
- recordCount: chunk.recordCount,
394
- progress: `${successCount + failureCount}/${chunks.length}`,
395
- });
396
-
397
- // Process chunk
398
- await processChunk(s3, client, job.id, chunk);
399
-
400
- // Mark chunk as completed
401
- await kvAdapter.set(['chunk', workflowId, chunk.chunkId], {
402
- ...chunk,
403
- status: 'completed',
404
- processedAt: new Date().toISOString(),
405
- } as ChunkMetadata);
406
-
407
- successCount++;
408
-
409
- logger.info('Chunk completed', {
410
- chunkId: chunk.chunkId,
411
- successCount,
412
- failureCount,
413
- percentComplete: (((successCount + failureCount) / chunks.length) * 100).toFixed(1),
414
- });
415
- } catch (error) {
416
- failureCount++;
417
- logger.error('Chunk processing failed', error as Error, {
418
- chunkId: chunk.chunkId,
419
- });
420
-
421
- // Mark chunk as failed
422
- await kvAdapter.set(['chunk', workflowId, chunk.chunkId], {
423
- ...chunk,
424
- status: 'failed',
425
- error: (error as Error).message,
426
- } as ChunkMetadata);
427
- }
428
- }
429
-
430
- // STEP 5: Update final state
431
- await stateService.updateSyncState(
432
- kvAdapter,
433
- [
434
- {
435
- fileName: SOURCE_FILE,
436
- lastModified: new Date().toISOString(),
437
- recordCount: chunks.reduce((sum, c) => sum + c.recordCount, 0),
438
- },
439
- ],
440
- workflowId
441
- );
442
-
443
- logger.info('Chunked ingestion complete', {
444
- totalChunks: chunks.length,
445
- successCount,
446
- failureCount,
447
- jobId: job.id,
448
- });
449
-
450
- return {
451
- success: failureCount === 0,
452
- jobId: job.id,
453
- chunksProcessed: successCount,
454
- chunksFailed: failureCount,
455
- totalChunks: chunks.length,
456
- };
457
- }
458
-
459
- /**
460
- * Split file into chunks and upload to S3
461
- */
462
- async function splitFileIntoChunks(
463
- s3: S3DataSource,
464
- sourceKey: string,
465
- chunkSize: number,
466
- workflowId: string,
467
- kv: VersoriKVAdapter
468
- ): Promise<ChunkMetadata[]> {
469
- const csvParser = new CSVParserService();
470
- const chunks: ChunkMetadata[] = [];
471
-
472
- // Download source file
473
- const fileContent = (await s3.downloadFile(sourceKey, {
474
- encoding: 'utf8',
475
- })) as string;
476
-
477
- let currentChunk: any[] = [];
478
- let chunkNumber = 0;
479
- let recordNumber = 0;
480
-
481
- // Stream through file and create chunks
482
- for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
483
- currentChunk.push(record);
484
- recordNumber++;
485
-
486
- // Create chunk when size reached
487
- if (currentChunk.length >= chunkSize) {
488
- const chunkId = `chunk-${chunkNumber.toString().padStart(5, '0')}`;
489
- const chunkKey = `temp/${workflowId}/${chunkId}.csv`;
490
-
491
- // Convert chunk to CSV
492
- const chunkCSV = csvParser.stringify(currentChunk, { headers: true });
493
-
494
- // Upload chunk to S3
495
- await s3.uploadFile(chunkKey, chunkCSV, {
496
- contentType: 'text/csv',
497
- });
498
-
499
- // Create chunk metadata
500
- const metadata: ChunkMetadata = {
501
- chunkId,
502
- startRecord: recordNumber - currentChunk.length,
503
- endRecord: recordNumber - 1,
504
- s3Key: chunkKey,
505
- recordCount: currentChunk.length,
506
- status: 'pending',
507
- };
508
-
509
- chunks.push(metadata);
510
-
511
- // Store chunk metadata in KV
512
- await kv.set(['chunk', workflowId, chunkId], metadata);
513
-
514
- logger.info('Chunk created', {
515
- chunkId,
516
- recordCount: currentChunk.length,
517
- s3Key: chunkKey,
518
- });
519
-
520
- // Clear chunk (free memory)
521
- currentChunk = [];
522
- chunkNumber++;
523
- }
524
- }
525
-
526
- // Handle remaining records
527
- if (currentChunk.length > 0) {
528
- const chunkId = `chunk-${chunkNumber.toString().padStart(5, '0')}`;
529
- const chunkKey = `temp/${workflowId}/${chunkId}.csv`;
530
-
531
- const chunkCSV = csvParser.stringify(currentChunk, { headers: true });
532
- await s3.uploadFile(chunkKey, chunkCSV, { contentType: 'text/csv' });
533
-
534
- const metadata: ChunkMetadata = {
535
- chunkId,
536
- startRecord: recordNumber - currentChunk.length,
537
- endRecord: recordNumber - 1,
538
- s3Key: chunkKey,
539
- recordCount: currentChunk.length,
540
- status: 'pending',
541
- };
542
-
543
- chunks.push(metadata);
544
- await kv.set(['chunk', workflowId, chunkId], metadata);
545
- }
546
-
547
- return chunks;
548
- }
549
-
550
- /**
551
- * Process a single chunk
552
- */
553
- async function processChunk(
554
- s3: S3DataSource,
555
- client: any,
556
- jobId: string,
557
- chunk: ChunkMetadata
558
- ): Promise<void> {
559
- const csvParser = new CSVParserService();
560
- const mapper = new UniversalMapper({
561
- fields: {
562
- skuRef: { source: 'sku', required: true },
563
- locationRef: { source: 'location_code', required: true },
564
- qty: { source: 'quantity', resolver: 'sdk.parseInt' },
565
- expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
566
- },
567
- });
568
-
569
- // Download chunk
570
- const chunkContent = (await s3.downloadFile(chunk.s3Key, {
571
- encoding: 'utf8',
572
- })) as string;
573
-
574
- // Parse chunk
575
- const records = await csvParser.parse(chunkContent);
576
-
577
- // Map records
578
- const entities: any[] = [];
579
- for (const record of records) {
580
- const mapped = await mapper.map(record);
581
- if (mapped.success && mapped.data) {
582
- entities.push(mapped.data);
583
- }
584
- }
585
-
586
- // Send batch
587
- await client.sendBatch(jobId, { entities });
588
-
589
- logger.info('Chunk batch sent', {
590
- chunkId: chunk.chunkId,
591
- entityCount: entities.length,
592
- });
593
- }
594
- ```
595
-
596
- **VersoriKV Schema:**
597
-
598
- ```typescript
599
- // Chunk metadata
600
- ['chunk', workflowId, chunkId] => ChunkMetadata
601
-
602
- // Chunk status
603
- ['chunk', workflowId, chunkId, 'status'] => 'pending' | 'processing' | 'completed' | 'failed'
604
-
605
- // Workflow state
606
- ['state', workflowId, 'sync'] => SyncState
607
- ```
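These keys work against any `get`/`set` key-value store. A minimal in-memory sketch of the schema in use — the `MemoryKV` class is an illustrative stand-in, not the SDK's `VersoriKVAdapter`:

```typescript
type KvKey = (string | number)[];

// Illustrative in-memory stand-in for VersoriKV (not part of the SDK).
class MemoryKV {
  private store = new Map<string, unknown>();

  async set(key: KvKey, value: unknown): Promise<void> {
    // Serialize the structured key so it can be used as a Map key.
    this.store.set(JSON.stringify(key), value);
  }

  async get<T>(key: KvKey): Promise<{ value: T } | undefined> {
    const v = this.store.get(JSON.stringify(key));
    return v === undefined ? undefined : { value: v as T };
  }
}
```

A later `set` on the same key overwrites the earlier value, which is how a chunk moves from `'pending'` to `'completed'` in the status key above.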
608
-
609
- **Performance:**
610
-
611
- ```
612
- File Size: 5GB (10M records)
613
- Chunk Size: 100K records
614
- Total Chunks: 100
615
- Processing Time: ~60 minutes (sequential)
616
- RAM Usage: ~100MB (processes one chunk at a time)
617
- ```
618
-
619
- ---
620
-
621
- ## Pattern 3: Parallel Processing (High Performance)
622
-
623
- **Best for:** 5M-10M records, time-sensitive ingestion, need speed with reliability
624
-
625
- **Strategy:**
626
-
627
- 1. Split file into chunks (same as Pattern 2)
628
- 2. Spawn 5 parallel Batch API jobs
629
- 3. Process chunks concurrently
630
- 4. Track progress in VersoriKV
631
- 5. Resume on failure
632
-
633
- ### Implementation
634
-
635
- ```typescript
636
- import {
637
- createClient,
638
- CSVParserService,
639
- S3DataSource,
640
- UniversalMapper,
641
- StateService,
642
- VersoriKVAdapter,
643
- createConsoleLogger,
644
- toStructuredLogger
645
- } from '@fluentcommerce/fc-connect-sdk';
646
-
647
- const logger = createConsoleLogger();
648
-
649
- interface ParallelJob {
650
- jobId: string;
651
- assignedChunks: string[];
652
- status: 'pending' | 'processing' | 'completed' | 'failed';
653
- recordsProcessed: number;
654
- startedAt?: string;
655
- completedAt?: string;
656
- }
657
-
658
- async function parallelIngestion(ctx: any) {
659
- logger.info('Starting parallel ingestion');
660
-
661
- // Initialize services
662
- const client = await createClient(ctx);
663
-
664
- const s3 = new S3DataSource(
665
- {
666
- type: 'S3_CSV',
667
- connectionId: 'my-s3-parallel',
668
- name: 'Inventory Files S3 Parallel',
669
- s3Config: {
670
- bucket: 'inventory-files',
671
- region: 'us-east-1',
672
- accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
673
- secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
674
- },
675
- },
676
- logger
677
- );
678
-
679
- const kv = ctx.openKv();
680
- const kvAdapter = new VersoriKVAdapter(kv);
681
- const stateService = new StateService(logger);
682
-
683
- const SOURCE_FILE = 'inventory/huge-inventory.csv';
684
- const CHUNK_SIZE = 100000; // 100K records per chunk
685
- const PARALLEL_JOBS = 5; // Process 5 chunks concurrently
686
- const workflowId = 'parallel-ingestion';
687
-
688
- // STEP 1: Split file into chunks (reuse from Pattern 2)
689
- const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);
690
-
691
- logger.info('File split complete', {
692
- totalChunks: chunks.length,
693
- totalRecords: chunks.reduce((sum, c) => sum + c.recordCount, 0),
694
- });
695
-
696
- // STEP 2: Create multiple jobs for parallel processing
697
- const jobs: ParallelJob[] = [];
698
-
699
- for (let i = 0; i < PARALLEL_JOBS; i++) {
700
- const job = await client.createJob({
701
- name: `parallel-inventory-ingestion-job-${i + 1}`,
702
- retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
703
- });
704
-
705
- jobs.push({
706
- jobId: job.id,
707
- assignedChunks: [],
708
- status: 'pending',
709
- recordsProcessed: 0,
710
- });
711
-
712
- logger.info('Parallel job created', {
713
- jobNumber: i + 1,
714
- jobId: job.id,
715
- });
716
- }
717
-
718
- // STEP 3: Distribute chunks across jobs (round-robin)
719
- chunks.forEach((chunk, index) => {
720
- const jobIndex = index % PARALLEL_JOBS;
721
- jobs[jobIndex].assignedChunks.push(chunk.chunkId);
722
- });
723
-
724
- logger.info('Chunks distributed', {
725
- totalChunks: chunks.length,
726
- jobCount: PARALLEL_JOBS,
727
- chunksPerJob: jobs.map(j => j.assignedChunks.length),
728
- });
729
-
730
- // STEP 4: Process chunks in parallel
731
- const startTime = Date.now();
732
-
733
- const jobPromises = jobs.map((job, jobIndex) =>
734
- processJobChunks(
735
- s3,
736
- client,
737
- job,
738
- chunks.filter(c => job.assignedChunks.includes(c.chunkId)),
739
- workflowId,
740
- kvAdapter,
741
- jobIndex + 1
742
- )
743
- );
744
-
745
- // Wait for all jobs to complete
746
- const results = await Promise.allSettled(jobPromises);
747
- const duration = (Date.now() - startTime) / 1000;
748
-
749
- // STEP 5: Analyze results
750
- let successfulJobs = 0;
751
- let failedJobs = 0;
752
- let totalRecordsProcessed = 0;
753
-
754
- results.forEach((result, index) => {
755
- if (result.status === 'fulfilled') {
756
- successfulJobs++;
757
- totalRecordsProcessed += result.value.recordsProcessed;
758
-
759
- logger.info('Job completed', {
760
- jobNumber: index + 1,
761
- jobId: jobs[index].jobId,
762
- recordsProcessed: result.value.recordsProcessed,
763
- chunksProcessed: result.value.chunksProcessed,
764
- });
765
- } else {
766
- failedJobs++;
767
- logger.error('Job failed', result.reason, {
768
- jobNumber: index + 1,
769
- jobId: jobs[index].jobId,
770
- });
771
- }
772
- });
773
-
774
- // STEP 6: Update final state
775
- await stateService.updateSyncState(
776
- kvAdapter,
777
- [
778
- {
779
- fileName: SOURCE_FILE,
780
- lastModified: new Date().toISOString(),
781
- recordCount: totalRecordsProcessed,
782
- },
783
- ],
784
- workflowId
785
- );
786
-
787
- logger.info('Parallel ingestion complete', {
788
- totalChunks: chunks.length,
789
- parallelJobs: PARALLEL_JOBS,
790
- successfulJobs,
791
- failedJobs,
792
- totalRecordsProcessed,
793
- durationSeconds: duration,
794
- recordsPerSecond: Math.round(totalRecordsProcessed / duration),
795
- });
796
-
797
- return {
798
- success: failedJobs === 0,
799
- totalChunks: chunks.length,
800
- totalRecordsProcessed,
801
- successfulJobs,
802
- failedJobs,
803
- durationSeconds: duration,
804
- recordsPerSecond: Math.round(totalRecordsProcessed / duration),
805
- };
806
- }
807
-
808
- /**
809
- * Process all chunks assigned to a job
810
- */
811
- async function processJobChunks(
812
- s3: S3DataSource,
813
- client: any,
814
- job: ParallelJob,
815
- chunks: ChunkMetadata[],
816
- workflowId: string,
817
- kv: VersoriKVAdapter,
818
- jobNumber: number
819
- ): Promise<{ recordsProcessed: number; chunksProcessed: number }> {
820
- logger.info(`Job ${jobNumber} starting`, {
821
- jobId: job.jobId,
822
- assignedChunks: chunks.length,
823
- });
824
-
825
- let recordsProcessed = 0;
826
- let chunksProcessed = 0;
827
-
828
- for (const chunk of chunks) {
829
- try {
830
- // Check if chunk already processed
831
- const chunkState = await kv.get(['chunk', workflowId, chunk.chunkId, 'status']);
832
-
833
- if (chunkState?.value === 'completed') {
834
- logger.info(`Job ${jobNumber}: Chunk already processed`, {
835
- chunkId: chunk.chunkId,
836
- });
837
- chunksProcessed++;
838
- continue;
839
- }
840
-
841
- // Mark chunk as processing
842
- await kv.set(['chunk', workflowId, chunk.chunkId, 'status'], 'processing');
843
-
844
- logger.info(`Job ${jobNumber}: Processing chunk`, {
845
- chunkId: chunk.chunkId,
846
- recordCount: chunk.recordCount,
847
- progress: `${chunksProcessed}/${chunks.length}`,
848
- });
849
-
850
- // Process chunk
851
- await processChunk(s3, client, job.jobId, chunk);
852
-
853
- // Mark chunk as completed
854
- await kv.set(['chunk', workflowId, chunk.chunkId], {
855
- ...chunk,
856
- status: 'completed',
857
- processedAt: new Date().toISOString(),
858
- } as ChunkMetadata);
859
-
860
- recordsProcessed += chunk.recordCount;
861
- chunksProcessed++;
862
-
863
- logger.info(`Job ${jobNumber}: Chunk completed`, {
864
- chunkId: chunk.chunkId,
865
- recordsProcessed,
866
- chunksProcessed,
867
- percentComplete: ((chunksProcessed / chunks.length) * 100).toFixed(1),
868
- });
869
- } catch (error) {
870
- logger.error(`Job ${jobNumber}: Chunk failed`, error as Error, {
871
- chunkId: chunk.chunkId,
872
- });
873
-
874
- // Mark chunk as failed (don't throw - continue with remaining chunks)
875
- await kv.set(['chunk', workflowId, chunk.chunkId], {
876
- ...chunk,
877
- status: 'failed',
878
- error: (error as Error).message,
879
- } as ChunkMetadata);
880
- }
881
- }
882
-
883
- logger.info(`Job ${jobNumber} completed`, {
884
- jobId: job.jobId,
885
- recordsProcessed,
886
- chunksProcessed,
887
- });
888
-
889
- return { recordsProcessed, chunksProcessed };
890
- }
891
- ```
892
-
893
- **Progress Tracking:**
894
-
895
- ```typescript
896
- // Real-time progress query
897
- async function getIngestionProgress(
898
- workflowId: string,
899
- kv: VersoriKVAdapter
900
- ): Promise<{
901
- totalChunks: number;
902
- completedChunks: number;
903
- failedChunks: number;
904
- processingChunks: number;
905
- percentComplete: number;
906
- }> {
907
- // This would query all chunk statuses from KV
908
- // Simplified example:
909
- const chunks = await getAllChunkMetadata(workflowId, kv);
910
-
911
- const completed = chunks.filter(c => c.status === 'completed').length;
912
- const failed = chunks.filter(c => c.status === 'failed').length;
913
- const processing = chunks.filter(c => c.status === 'processing').length;
914
-
915
- return {
916
- totalChunks: chunks.length,
917
- completedChunks: completed,
918
- failedChunks: failed,
919
- processingChunks: processing,
920
- percentComplete: (completed / chunks.length) * 100,
921
- };
922
- }
923
- ```
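`getAllChunkMetadata` is referenced above but not shown. A sketch, assuming the KV store can list entries under a key prefix — the `ListableKV` interface is illustrative, and VersoriKV's actual listing API may differ:

```typescript
interface ChunkStatusMeta {
  chunkId: string;
  status: 'pending' | 'processing' | 'completed' | 'failed';
}

// Illustrative KV listing interface; adapt to the real VersoriKV API.
interface ListableKV {
  list(prefix: string[]): Promise<Array<{ key: string[]; value: unknown }>>;
}

async function getAllChunkMetadata(
  workflowId: string,
  kv: ListableKV
): Promise<ChunkStatusMeta[]> {
  const entries = await kv.list(['chunk', workflowId]);
  // Keep only top-level metadata entries: key = ['chunk', workflowId, chunkId].
  // Deeper keys like [..., chunkId, 'status'] are filtered out.
  return entries
    .filter((e) => e.key.length === 3)
    .map((e) => e.value as ChunkStatusMeta);
}
```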
924
-
925
- **Performance:**
926
-
927
- ```
928
- File Size: 5GB (10M records)
929
- Chunk Size: 100K records
930
- Total Chunks: 100
931
- Parallel Jobs: 5
932
- Processing Time: ~15 minutes (4x speedup)
933
- RAM Usage: ~500MB (5 chunks in parallel)
934
- Throughput: ~11,111 records/second
935
- ```
936
-
937
- ---
938
-
939
- ## Pattern 4: Distributed Processing (Versori Workflows)
940
-
941
- **Best for:** 10M+ records, enterprise scale, need maximum reliability and observability
942
-
943
- **Strategy:**
944
-
945
- 1. Coordinator workflow splits file and creates scheduled tasks
946
- 2. Each worker workflow processes one chunk
947
- 3. Coordinator tracks completion via VersoriKV
948
- 4. Automatic retry on worker failure
949
-
950
- ### Coordinator Workflow
951
-
952
- ```typescript
953
- import { fn, schedule } from '@versori/run';
954
- import {
955
- createClient,
956
- S3DataSource,
957
- VersoriKVAdapter,
958
- createConsoleLogger,
959
- toStructuredLogger
960
- } from '@fluentcommerce/fc-connect-sdk';
961
-
962
- const logger = createConsoleLogger();
963
-
964
- /**
965
- * Coordinator workflow - splits file and spawns workers
966
- */
967
- export const coordinatorWorkflow = schedule('coordinator')
968
- .cron('0 2 * * *') // Run daily at 2 AM
969
- .then(
970
- fn('split-and-schedule', async ({ activation, connections, kv }) => {
971
- logger.info('Coordinator: Starting distributed ingestion');
972
-
973
- const s3 = new S3DataSource(
974
- {
975
- type: 'S3_CSV',
976
- connectionId: 'my-s3-3',
977
- name: 'Inventory Files S3 3',
978
- s3Config: {
979
- bucket: 'inventory-files',
980
- region: 'us-east-1',
981
- accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
982
- secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
983
- },
984
- },
985
- logger
986
- );
987
-
988
- const kvAdapter = new VersoriKVAdapter(kv);
989
- const workflowId = `distributed-${Date.now()}`;
990
- const SOURCE_FILE = 'inventory/enterprise-inventory.csv';
991
- const CHUNK_SIZE = 100000;
992
-
993
- // Split file into chunks
994
- const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);
995
-
996
- logger.info('Coordinator: File split complete', {
997
- totalChunks: chunks.length,
998
- workflowId,
999
- });
1000
-
1001
- // Store coordinator state
1002
- await kvAdapter.set(['coordinator', workflowId], {
1003
- workflowId,
1004
- sourceFile: SOURCE_FILE,
1005
- totalChunks: chunks.length,
1006
- status: 'scheduled',
1007
- createdAt: new Date().toISOString(),
1008
- });
1009
-
1010
- // Schedule worker for each chunk
1011
- for (const chunk of chunks) {
1012
- // Trigger worker workflow (Versori will handle scheduling)
1013
- await activation.triggerWorkflow('chunk-worker', {
1014
- workflowId,
1015
- chunkId: chunk.chunkId,
1016
- chunkKey: chunk.s3Key,
1017
- recordCount: chunk.recordCount,
1018
- });
1019
-
1020
- logger.info('Coordinator: Worker scheduled', {
1021
- chunkId: chunk.chunkId,
1022
- workflowId,
1023
- });
1024
- }
1025
-
1026
- return {
1027
- workflowId,
1028
- totalChunks: chunks.length,
1029
- message: `Scheduled ${chunks.length} worker workflows`,
1030
- };
1031
- })
1032
- );
1033
-
1034
- /**
1035
- * Monitor workflow - checks completion status
1036
- */
1037
- export const monitorWorkflow = schedule('monitor')
1038
- .cron('*/5 * * * *') // Run every 5 minutes
1039
- .then(
1040
- fn('check-progress', async ({ kv }) => {
1041
- const kvAdapter = new VersoriKVAdapter(kv);
1042
-
1043
- // Get all active coordinators
1044
- const coordinators = await getActiveCoordinators(kvAdapter);
1045
-
1046
- for (const coordinator of coordinators) {
1047
- const progress = await getIngestionProgress(coordinator.workflowId, kvAdapter);
1048
-
1049
- logger.info('Monitor: Progress update', {
1050
- workflowId: coordinator.workflowId,
1051
- ...progress,
1052
- });
1053
-
1054
- // Check if complete
1055
- if (progress.completedChunks + progress.failedChunks === progress.totalChunks) {
1056
- // Mark coordinator as complete
1057
- await kvAdapter.set(['coordinator', coordinator.workflowId], {
1058
- ...coordinator,
1059
- status: 'completed',
1060
- completedAt: new Date().toISOString(),
1061
- progress,
1062
- });
1063
-
1064
- logger.info('Monitor: Ingestion complete', {
1065
- workflowId: coordinator.workflowId,
1066
- ...progress,
1067
- });
1068
- }
1069
- }
1070
-
1071
- return { coordinatorsChecked: coordinators.length };
1072
- })
1073
- );
1074
- ```
1075
-
1076
- ### Worker Workflow
1077
-
1078
- ```typescript
1079
- import { fn, webhook } from '@versori/run';
1080
- import {
1081
- createClient,
1082
- S3DataSource,
1083
- CSVParserService,
1084
- UniversalMapper,
1085
- VersoriKVAdapter,
1086
- createConsoleLogger,
1087
- toStructuredLogger
1088
- } from '@fluentcommerce/fc-connect-sdk';
1089
-
1090
- const logger = createConsoleLogger();
1091
-
1092
- /**
1093
- * Worker workflow - processes a single chunk
1094
- */
1095
- export const chunkWorker = webhook('chunk-worker').then(
1096
- fn('process-chunk', async (ctx) => {
- const { data, kv } = ctx;
1097
- const { workflowId, chunkId, chunkKey, recordCount } = data;
1098
-
1099
- logger.info('Worker: Starting chunk processing', {
1100
- workflowId,
1101
- chunkId,
1102
- recordCount,
1103
- });
1104
-
1105
- const kvAdapter = new VersoriKVAdapter(kv);
1106
-
1107
- // Check if already processed
1108
- const chunkState = await kvAdapter.get(['chunk', workflowId, chunkId, 'status']);
1109
-
1110
- if (chunkState?.value === 'completed') {
1111
- logger.info('Worker: Chunk already processed', { chunkId });
1112
- return { chunkId, status: 'skipped', message: 'Already processed' };
1113
- }
1114
-
1115
- // Mark as processing
1116
- await kvAdapter.set(['chunk', workflowId, chunkId, 'status'], 'processing');
1117
-
1118
- try {
1119
- // Initialize services
1120
- const client = await createClient(ctx);
1121
-
1122
- const s3 = new S3DataSource(
1123
- {
1124
- type: 'S3_CSV',
1125
- connectionId: 'my-s3-4',
1126
- name: 'Inventory Files S3 4',
1127
- s3Config: {
1128
- bucket: 'inventory-files',
1129
- region: 'us-east-1',
1130
- accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
1131
- secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
1132
- },
1133
- },
1134
- logger
1135
- );
1136
-
1137
- const csvParser = new CSVParserService();
1138
- const mapper = new UniversalMapper({
1139
- fields: {
1140
- skuRef: { source: 'sku', required: true },
1141
- locationRef: { source: 'location_code', required: true },
1142
- qty: { source: 'quantity', resolver: 'sdk.parseInt' },
1143
- expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
1144
- },
1145
- });
1146
-
1147
- // Get or create job for this workflow
1148
- let jobId = await kvAdapter.get(['job', workflowId, 'jobId']);
1149
-
1150
- if (!jobId?.value) {
1151
- const job = await client.createJob({
1152
- name: `distributed-ingestion-${workflowId}`,
1153
- retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
1154
- });
1155
-
1156
- await kvAdapter.set(['job', workflowId, 'jobId'], job.id);
1157
- jobId = { value: job.id };
1158
- }
1159
-
1160
- // Download chunk
1161
- const chunkContent = (await s3.downloadFile(chunkKey, {
1162
- encoding: 'utf8',
1163
- })) as string;
1164
-
1165
- // Parse chunk
1166
- const records = await csvParser.parse(chunkContent);
1167
-
1168
- // Map records
1169
- const entities: any[] = [];
1170
- for (const record of records) {
1171
- const mapped = await mapper.map(record);
1172
- if (mapped.success && mapped.data) {
1173
- entities.push(mapped.data);
1174
- }
1175
- }
1176
-
1177
- // Send batch
1178
- await client.sendBatch(jobId.value as string, { entities });
1179
-
1180
- // Mark as completed
1181
- await kvAdapter.set(['chunk', workflowId, chunkId], {
1182
- chunkId,
1183
- s3Key: chunkKey,
1184
- recordCount: entities.length,
1185
- status: 'completed',
1186
- processedAt: new Date().toISOString(),
1187
- });
1188
-
1189
- logger.info('Worker: Chunk completed', {
1190
- workflowId,
1191
- chunkId,
1192
- recordCount: entities.length,
1193
- });
1194
-
1195
- return {
1196
- chunkId,
1197
- status: 'completed',
1198
- recordsProcessed: entities.length,
1199
- };
1200
- } catch (error) {
1201
- logger.error('Worker: Chunk failed', error as Error, {
1202
- workflowId,
1203
- chunkId,
1204
- });
1205
-
1206
- // Mark as failed
1207
- await kvAdapter.set(['chunk', workflowId, chunkId], {
1208
- chunkId,
1209
- s3Key: chunkKey,
1210
- recordCount,
1211
- status: 'failed',
1212
- error: (error as Error).message,
1213
- });
1214
-
1215
- throw error;
1216
- }
1217
- })
1218
- );
1219
- ```
1220
-
1221
- **Performance:**
1222
-
1223
- ```
1224
- File Size: 10GB (20M records)
1225
- Chunk Size: 100K records
1226
- Total Chunks: 200
1227
- Worker Workflows: 200 (parallel)
1228
- Processing Time: ~10 minutes (Versori handles parallelism)
1229
- RAM Usage: ~50MB per worker
1230
- Throughput: ~33,333 records/second
1231
- ```
1232
-
1233
- ---
1234
-
1235
- ## Memory Optimization Tips
1236
-
1237
- ### 1. Use Streaming APIs
1238
-
1239
- ```typescript
1240
- // ❌ WRONG - Loads entire file into memory
1241
- const fileContent = await fs.readFile('huge.csv', 'utf-8');
1242
- const records = await csvParser.parse(fileContent);
1243
-
1244
- // ✅ CORRECT - Streams records incrementally
1245
- for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
1246
- await processRecord(record);
1247
- }
1248
- ```
1249
-
1250
- ### 2. Clear Batches After Processing
1251
-
1252
- ```typescript
1253
- let batch: any[] = [];
1254
- for await (const record of records) {
1255
- batch.push(record);
1256
-
1257
- if (batch.length >= 1000) {
1258
- await sendBatch(batch);
1259
- batch = []; // ✅ Clear batch to free memory
1260
- }
1261
- }
1262
- ```
1263
-
1264
- ### 3. Monitor Memory Usage
1265
-
1266
- ```typescript
1267
- function logMemoryUsage() {
1268
- const used = process.memoryUsage();
1269
- console.log({
1270
- heapUsed: Math.round(used.heapUsed / 1024 / 1024) + ' MB',
1271
- heapTotal: Math.round(used.heapTotal / 1024 / 1024) + ' MB',
1272
- rss: Math.round(used.rss / 1024 / 1024) + ' MB',
1273
- });
1274
- }
1275
-
1276
- // Log every 10K records
1277
- if (recordsProcessed % 10000 === 0) {
1278
- logMemoryUsage();
1279
- }
1280
- ```
1281
-
1282
- ### 4. Use Garbage Collection Hints
1283
-
1284
- ```typescript
1285
- // Force garbage collection (requires --expose-gc flag)
1286
- if (recordsProcessed % 100000 === 0 && global.gc) {
1287
- global.gc();
1288
- logger.info('Garbage collection triggered', { recordsProcessed });
1289
- }
1290
- ```
1291
-
1292
- ---
1293
-
1294
- ## Performance Benchmarks
1295
-
1296
- ### Pattern Comparison (10M records, 5GB file)
1297
-
1298
- | Pattern | Time | RAM | Throughput | Complexity |
1299
- | ------------------------- | ------ | ------ | -------------- | ---------- |
1300
- | 1. Basic Streaming | 90 min | 50MB | 1,852 rec/sec | Low |
1301
- | 2. File Chunking | 60 min | 100MB | 2,778 rec/sec | Medium |
1302
- | 3. Parallel Processing | 15 min | 500MB | 11,111 rec/sec | High |
1303
- | 4. Distributed Processing | 10 min | 50MB\* | 16,667 rec/sec | Very High |
1304
-
1305
- \*Per worker; total RAM = 50MB × worker count
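The throughput column is simply records divided by elapsed seconds; for the 10M-record benchmark:

```typescript
// Derivation of the throughput column: records / elapsed seconds.
const records = 10_000_000;
const recPerSec = (minutes: number): number => Math.round(records / (minutes * 60));

recPerSec(90); // Pattern 1 → 1852 rec/sec
recPerSec(60); // Pattern 2 → 2778 rec/sec
recPerSec(15); // Pattern 3 → 11111 rec/sec
recPerSec(10); // Pattern 4 → 16667 rec/sec
```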
1306
-
1307
- ### Optimization Impact
1308
-
1309
- | Optimization | Before | After | Improvement |
1310
- | ------------------------- | ------- | -------- | ----------- |
1311
- | Streaming vs Loading | 5GB RAM | 50MB RAM | 100x |
1312
- | Batching (1K vs 10K) | 90 min | 60 min | 1.5x |
1313
- | Parallel (1 vs 5 jobs) | 60 min | 15 min | 4x |
1314
- | Distributed (200 workers) | 15 min | 10 min | 1.5x |
1315
-
1316
- ---
1317
-
1318
- ## Common Issues & Solutions
1319
-
1320
- ### Issue 1: Out of Memory
1321
-
1322
- **Symptoms:**
1323
-
1324
- ```
1325
- FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1326
- ```
1327
-
1328
- **Solutions:**
1329
-
1330
- 1. Switch to streaming pattern (Pattern 1)
1331
- 2. Reduce batch size (1000 => 500)
1332
- 3. Increase Node.js heap: `node --max-old-space-size=4096`
1333
- 4. Use file chunking (Pattern 2)
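Solution 2 can be made adaptive: shrink the batch when heap pressure rises. A sketch — the 80% threshold and the floor of 100 are arbitrary choices for illustration, not SDK defaults:

```typescript
// Illustrative adaptive batch sizing: halve the batch when heap usage is high,
// never dropping below a minimum batch size.
function nextBatchSize(
  current: number,
  heapUsedBytes: number,
  heapLimitBytes: number
): number {
  const MIN_BATCH = 100;
  if (heapUsedBytes / heapLimitBytes > 0.8) {
    return Math.max(MIN_BATCH, Math.floor(current / 2));
  }
  return current;
}

// Usage inside a streaming loop (Node.js):
// const { heapUsed, heapTotal } = process.memoryUsage();
// BATCH_SIZE = nextBatchSize(BATCH_SIZE, heapUsed, heapTotal);
```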
1334
-
1335
- ### Issue 2: Timeout on Large Files
1336
-
1337
- **Symptoms:**
1338
-
1339
- ```
1340
- TimeoutError: Operation timed out after 300000ms
1341
- ```
1342
-
1343
- **Solutions:**
1344
-
1345
- 1. Increase timeout: `config.timeout = 600000` (10 min)
1346
- 2. Split file into chunks (Pattern 2)
1347
- 3. Use parallel processing (Pattern 3)
1348
-
1349
- ### Issue 3: Chunks Not Resuming
1350
-
1351
- **Symptoms:**
1352
-
1353
- - Re-processing same chunks on failure
1354
-
1355
- **Solutions:**
1356
-
1357
- ```typescript
1358
- // Check chunk status before processing
1359
- const chunkState = await kv.get(['chunk', workflowId, chunkId, 'status']);
1360
- if (chunkState?.value === 'completed') {
1361
- logger.info('Chunk already processed, skipping', { chunkId });
1362
- continue;
1363
- }
1364
- ```
1365
-
1366
- ### Issue 4: Progress Tracking Inconsistent
1367
-
1368
- **Symptoms:**
1369
-
1370
- - Progress percentage doesn't match reality
1371
-
1372
- **Solutions:**
1373
-
1374
- ```typescript
1375
- // Always update chunk status atomically
1376
- const atomic = kv.atomic();
1377
- atomic.set(['chunk', workflowId, chunkId, 'status'], 'completed');
1378
- atomic.set(['chunk', workflowId, chunkId, 'processedAt'], new Date().toISOString());
1379
- await atomic.commit();
1380
- ```
1381
-
1382
- ### Issue 5: Duplicate Processing
1383
-
1384
- **Symptoms:**
1385
-
1386
- - Same records sent multiple times
1387
-
1388
- **Solutions:**
1389
-
1390
- ```typescript
1391
- // Use idempotency keys in Fluent batch payload
1392
- await client.sendBatch(jobId, {
1393
- entities,
1394
- meta: {
1395
- chunkId: chunk.chunkId,
1396
- workflowId,
1397
- idempotencyKey: `${workflowId}-${chunk.chunkId}`,
1398
- },
1399
- });
1400
- ```
1401
-
1402
- ---
1403
-
1404
- ## Related Guides
1405
-
1406
- - [Basic Ingestion Pattern](../standalone/s3-csv-batch-api.md) - For small files (<100K records)
1407
- - [Streaming Pattern](../../02-CORE-GUIDES/ingestion/ingestion-readme.md) - For medium files (100K-1M records)
1408
- - [Error Handling & Retry](./error-handling-retry.md) - Robust error handling strategies
1409
- - [Progress Tracking](../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-08-performance-optimization.md) - Real-time progress monitoring
1410
- - [State Management](../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-state-management.md) - VersoriKV patterns
1411
-
1412
- ---
1413
-
1414
- ## Summary
1415
-
1416
- **Choose Your Pattern:**
1417
-
1418
- - **Pattern 1 (Streaming)**: Simple, memory-efficient, suitable for 100K-1M records
1419
- - **Pattern 2 (Chunking)**: Checkpoint/resume, suitable for 1M-5M records
1420
- - **Pattern 3 (Parallel)**: High performance, suitable for 5M-10M records
1421
- - **Pattern 4 (Distributed)**: Enterprise scale, suitable for 10M+ records
1422
-
1423
- **Key Takeaways:**
1424
-
1425
- 1. Always use streaming APIs for large files
1426
- 2. Clear batches after processing to free memory
1427
- 3. Use chunks + VersoriKV for checkpoint/resume
1428
- 4. Parallel processing trades RAM for speed
1429
- 5. Monitor memory usage throughout processing
1430
- 6. Test with representative file sizes before production
1
+ # Pattern: Large File Processing & Chunking
2
+
3
+ **FC Connect SDK Use Case Guide**
4
+
5
+ > **SDK**: [@fluentcommerce/fc-connect-sdk](https://www.npmjs.com/package/@fluentcommerce/fc-connect-sdk)
6
+ > **Version**: Use latest - `npm install @fluentcommerce/fc-connect-sdk@latest`
7
+
8
+ **Context**: Enterprise-scale file ingestion with streaming, splitting, and parallel processing
9
+
10
+ **Type**: Advanced Pattern
11
+
12
+ **Complexity**: High
13
+
14
+ **Volume**: 500MB-5GB files, 1M-10M records
15
+
16
+ **Latency**: Batch processing (30-60 min for 10M records)
17
+
18
+ **Pattern**: Streaming + chunking + parallel Batch API
19
+
20
+ ## When to Use This Pattern
21
+
22
+ Use this pattern when dealing with:
23
+
24
+ - **Large CSV files** (>500MB, >1M records)
25
+ - **Memory-constrained environments** (Lambda, containers with limited RAM)
26
+ - **Time-sensitive ingestion** (need parallel processing for speed)
27
+ - **Reliability requirements** (checkpoint/resume on failure)
28
+ - **Progress tracking** (real-time status updates)
29
+
30
+ **Volume Guidance:**
31
+
32
+ - **Small** (<100K records): Use basic ingestion pattern
33
+ - **Medium** (100K-1M records): Use streaming pattern (Pattern 1)
34
+ - **Large** (1M-5M records): Use file chunking pattern (Pattern 2)
35
+ - **Huge** (5M-10M records): Use parallel processing pattern (Pattern 3)
36
+ - **Enterprise** (10M+ records): Use distributed processing pattern (Pattern 4)
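The tiers above can be captured in a small helper. `choosePattern` is illustrative (not an SDK export); the thresholds follow the "Best for" ranges on each pattern heading in this guide:

```typescript
type IngestionPattern = 'basic' | 'streaming' | 'chunking' | 'parallel' | 'distributed';

// Illustrative mapping from record count to the patterns in this guide.
// Thresholds follow the "Best for" ranges on each pattern heading.
function choosePattern(recordCount: number): IngestionPattern {
  if (recordCount < 100_000) return 'basic';       // small files
  if (recordCount < 1_000_000) return 'streaming'; // Pattern 1
  if (recordCount < 5_000_000) return 'chunking';  // Pattern 2
  if (recordCount < 10_000_000) return 'parallel'; // Pattern 3
  return 'distributed';                            // Pattern 4
}
```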
37
+
38
+ ## Problem Statement
39
+
40
+ ### Why Splitting is Needed
41
+
42
+ **Memory Constraints:**
43
+
44
+ ```typescript
45
+ // ❌ WRONG - Loads entire 2GB file into memory
46
+ const csvContent = await fs.readFile('huge-inventory.csv', 'utf-8');
47
+ const records = await csvParser.parse(csvContent); // 💥 Out of memory
48
+ ```
49
+
50
+ **Impact:**
51
+
52
+ - Lambda 512MB: Crashes on 500MB+ files
53
+ - Container 1GB: Struggles with 1GB+ files
54
+ - Node.js default heap (4GB): Fails on 5GB+ files
55
+
56
+ **Time Constraints:**
57
+
58
+ ```typescript
59
+ // ❌ WRONG - Sequential processing takes 90+ minutes
60
+ for (const record of records) {
61
+ await processRecord(record); // Too slow for 10M records
62
+ }
63
+ ```
64
+
65
+ **Reliability Requirements:**
66
+
67
+ ```typescript
68
+ // ❌ WRONG - Network failure loses all progress
69
+ await processAllRecords(records); // If fails at record 5M, restart from 0
70
+ ```
71
+
72
+ ### Solution Overview
73
+
74
+ This guide demonstrates 4 progressive patterns:
75
+
76
+ 1. **Basic Streaming** (~200 lines) - Process records as they arrive, memory-efficient
77
+ 2. **File Chunking** (~300 lines) - Split large files into manageable chunks
78
+ 3. **Parallel Processing** (~400 lines) - Process chunks concurrently with progress tracking
79
+ 4. **Distributed Processing** (~300 lines) - Use Versori scheduled workflows for enterprise scale
80
+
81
+ ## SDK Methods Used
82
+
83
+ ```typescript
84
+ import {
85
+ createClient,
86
+ // Client factory (auto-detects context)
87
+ CSVParserService,
88
+ // Streaming CSV parser
89
+ S3DataSource,
90
+ // S3 file operations
91
+ UniversalMapper,
92
+ // Field mapping
93
+ StateService,
94
+ // Progress tracking
95
+ VersoriKVAdapter,
96
+ // Versori state management
97
+ createConsoleLogger,
98
+ // Structured logging
99
+ toStructuredLogger
100
+ } from '@fluentcommerce/fc-connect-sdk';
101
+ ```
102
+
103
+ ---
104
+
105
+ ## Pattern 1: Basic Streaming (Memory-Efficient)
106
+
107
+ **Best for:** 100K-1M records, single-threaded processing, memory-constrained environments
108
+
109
+ **Memory Usage:**
110
+
111
+ - ❌ Without streaming: 2GB file = 2GB+ RAM (file + parsed objects)
112
+ - ✅ With streaming: 2GB file = ~50MB RAM (processes records incrementally)
113
+
114
+ ### Implementation
115
+
116
+ ```typescript
117
+ import {
118
+ createClient,
119
+ CSVParserService,
120
+ S3DataSource,
121
+ UniversalMapper,
122
+ createConsoleLogger,
123
+ toStructuredLogger
124
+ } from '@fluentcommerce/fc-connect-sdk';
125
+
126
+ const logger = createConsoleLogger();
127
+
128
+ async function streamingIngestion(ctx: any) {
129
+ logger.info('Starting streaming ingestion');
130
+
131
+ // Create client (auto-detects Versori context)
132
+ const client = await createClient(ctx);
133
+
134
+ // Initialize S3 data source
135
+ const s3 = new S3DataSource(
136
+ {
137
+ type: 'S3_CSV',
138
+ connectionId: 'my-s3',
139
+ name: 'Inventory Files S3',
140
+ s3Config: {
141
+ bucket: 'inventory-files',
142
+ region: 'us-east-1',
143
+ accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
144
+ secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
145
+ },
146
+ },
147
+ logger
148
+ );
149
+
150
+ // Define field mapping
151
+ const mapper = new UniversalMapper({
152
+ fields: {
153
+ skuRef: { source: 'sku', required: true },
154
+ locationRef: { source: 'location_code', required: true },
155
+ qty: { source: 'quantity', resolver: 'sdk.parseInt' },
156
+ expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
157
+ },
158
+ });
159
+
160
+ // Create CSV parser with streaming enabled
161
+ const csvParser = new CSVParserService();
162
+
163
+ // Download file content (parseStreaming below keeps parsing incremental; for multi-GB files prefer a streamed download)
164
+ logger.info('Downloading file from S3', {
165
+ key: 'inventory/large-file.csv',
166
+ });
167
+
168
+ const fileContent = (await s3.downloadFile('inventory/large-file.csv', {
169
+ encoding: 'utf8',
170
+ })) as string;
171
+
172
+ // Create job for batch ingestion
173
+ const job = await client.createJob({
174
+ name: 'streaming-inventory-ingestion',
175
+ retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
176
+ });
177
+
178
+ logger.info('Job created', { jobId: job.id });
179
+
180
+ // Statistics tracking
181
+ let recordsProcessed = 0;
182
+ let batchCount = 0;
183
+ let errors = 0;
184
+ const BATCH_SIZE = 1000;
185
+ let currentBatch: any[] = [];
186
+
187
+ // Stream records with batching (memory-efficient)
188
+ // Records are parsed incrementally, not all at once
189
+ for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
190
+ try {
191
+ // Map record
192
+ const mapped = await mapper.map(record);
193
+
194
+ if (mapped.success && mapped.data) {
195
+ currentBatch.push(mapped.data);
196
+ recordsProcessed++;
197
+
198
+ // Send batch when full
199
+ if (currentBatch.length >= BATCH_SIZE) {
200
+ await client.sendBatch(job.id, {
201
+ entities: currentBatch,
202
+ });
203
+
204
+ batchCount++;
205
+
206
+ logger.info('Batch sent', {
207
+ batchNumber: batchCount,
208
+ recordsProcessed,
209
+ currentBatchSize: currentBatch.length,
210
+ });
211
+
212
+ currentBatch = []; // Clear batch (frees memory)
213
+ }
214
+ } else {
215
+ errors++;
216
+ logger.warn('Record mapping failed', {
217
+ record,
218
+ errors: mapped.errors,
219
+ });
220
+ }
221
+ } catch (error) {
222
+ errors++;
223
+ logger.error('Record processing failed', error as Error, { record });
224
+ }
225
+
226
+ // Progress logging every 10K records
227
+ if (recordsProcessed % 10000 === 0) {
228
+ logger.info('Progress update', {
229
+ recordsProcessed,
230
+ batchesSent: batchCount,
231
+ errors,
232
+ memoryUsage: process.memoryUsage().heapUsed / 1024 / 1024 + ' MB',
233
+ });
234
+ }
235
+ }
236
+
237
+ // Send remaining records
238
+ if (currentBatch.length > 0) {
239
+ await client.sendBatch(job.id, {
240
+ entities: currentBatch,
241
+ });
242
+ batchCount++;
243
+ }
244
+
245
+ logger.info('Streaming ingestion complete', {
246
+ totalRecords: recordsProcessed,
247
+ batchesSent: batchCount,
248
+ errors,
249
+ jobId: job.id,
250
+ });
251
+
252
+ return {
253
+ success: true,
254
+ jobId: job.id,
255
+ recordsProcessed,
256
+ batchesSent: batchCount,
257
+ errors,
258
+ };
259
+ }
260
+ ```
261
+
262
+ **Memory Profile:**
263
+
264
+ ```
265
+ File Size: 2GB (5M records)
266
+ RAM Usage: ~50MB peak (1000 record batches)
267
+ Processing Time: ~45 minutes (sequential)
268
+ ```
269
+
270
+ ---

## Pattern 2: File Chunking (Split & Track)

**Best for:** 1M-5M records, need checkpoint/resume, want progress visibility

**Strategy:**

1. Split the large file into 100K-record chunks
2. Write chunks to temp S3 locations
3. Track chunk metadata in VersoriKV
4. Process chunks sequentially (can resume on failure)

### Implementation

```typescript
import {
  createClient,
  CSVParserService,
  S3DataSource,
  UniversalMapper,
  StateService,
  VersoriKVAdapter,
  createConsoleLogger,
  toStructuredLogger
} from '@fluentcommerce/fc-connect-sdk';

const logger = createConsoleLogger();

interface ChunkMetadata {
  chunkId: string;
  startRecord: number;
  endRecord: number;
  s3Key: string;
  recordCount: number;
  status: 'pending' | 'processing' | 'completed' | 'failed';
  processedAt?: string;
  error?: string;
}

async function chunkedIngestion(ctx: any) {
  logger.info('Starting chunked ingestion');

  // Initialize services
  const client = await createClient(ctx);

  const s3 = new S3DataSource(
    {
      type: 'S3_CSV',
      connectionId: 'my-s3-chunked',
      name: 'Inventory Files S3 Chunked',
      s3Config: {
        bucket: 'inventory-files',
        region: 'us-east-1',
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
      },
    },
    logger
  );

  // Initialize state management
  const kv = ctx.openKv();
  const kvAdapter = new VersoriKVAdapter(kv);
  const stateService = new StateService(logger);

  const SOURCE_FILE = 'inventory/huge-inventory.csv';
  const CHUNK_SIZE = 100000; // 100K records per chunk
  const workflowId = 'chunked-ingestion';

  // STEP 1: Check if chunking is already in progress
  const existingState = await stateService.getSyncState(kvAdapter, workflowId);

  if (existingState.isInitialized && existingState.lastSyncResult === 'partial') {
    logger.info('Resuming from previous run', {
      lastProcessedFile: existingState.lastProcessedFile,
      lastProcessedCount: existingState.lastProcessedCount,
    });
  }

  // STEP 2: Split file into chunks
  logger.info('Splitting file into chunks', {
    sourceFile: SOURCE_FILE,
    chunkSize: CHUNK_SIZE,
  });

  const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);

  logger.info('File split complete', {
    totalChunks: chunks.length,
    totalRecords: chunks.reduce((sum, c) => sum + c.recordCount, 0),
  });

  // STEP 3: Create job for ingestion
  const job = await client.createJob({
    name: `chunked-inventory-ingestion-${Date.now()}`,
    retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
  });

  logger.info('Job created', { jobId: job.id });

  // STEP 4: Process each chunk sequentially
  let successCount = 0;
  let failureCount = 0;

  for (const chunk of chunks) {
    try {
      // Skip if already processed
      const chunkState = await kvAdapter.get(['chunk', workflowId, chunk.chunkId, 'status']);

      if (chunkState?.value === 'completed') {
        logger.info('Chunk already processed, skipping', {
          chunkId: chunk.chunkId,
        });
        successCount++;
        continue;
      }

      // Mark chunk as processing
      await kvAdapter.set(['chunk', workflowId, chunk.chunkId, 'status'], 'processing');

      logger.info('Processing chunk', {
        chunkId: chunk.chunkId,
        recordCount: chunk.recordCount,
        progress: `${successCount + failureCount}/${chunks.length}`,
      });

      // Process chunk
      await processChunk(s3, client, job.id, chunk);

      // Mark chunk as completed
      await kvAdapter.set(['chunk', workflowId, chunk.chunkId], {
        ...chunk,
        status: 'completed',
        processedAt: new Date().toISOString(),
      } as ChunkMetadata);

      successCount++;

      logger.info('Chunk completed', {
        chunkId: chunk.chunkId,
        successCount,
        failureCount,
        percentComplete: (((successCount + failureCount) / chunks.length) * 100).toFixed(1),
      });
    } catch (error) {
      failureCount++;
      logger.error('Chunk processing failed', error as Error, {
        chunkId: chunk.chunkId,
      });

      // Mark chunk as failed
      await kvAdapter.set(['chunk', workflowId, chunk.chunkId], {
        ...chunk,
        status: 'failed',
        error: (error as Error).message,
      } as ChunkMetadata);
    }
  }

  // STEP 5: Update final state
  await stateService.updateSyncState(
    kvAdapter,
    [
      {
        fileName: SOURCE_FILE,
        lastModified: new Date().toISOString(),
        recordCount: chunks.reduce((sum, c) => sum + c.recordCount, 0),
      },
    ],
    workflowId
  );

  logger.info('Chunked ingestion complete', {
    totalChunks: chunks.length,
    successCount,
    failureCount,
    jobId: job.id,
  });

  return {
    success: failureCount === 0,
    jobId: job.id,
    chunksProcessed: successCount,
    chunksFailed: failureCount,
    totalChunks: chunks.length,
  };
}

/**
 * Split file into chunks and upload to S3
 */
async function splitFileIntoChunks(
  s3: S3DataSource,
  sourceKey: string,
  chunkSize: number,
  workflowId: string,
  kv: VersoriKVAdapter
): Promise<ChunkMetadata[]> {
  const csvParser = new CSVParserService();
  const chunks: ChunkMetadata[] = [];

  // Download source file
  const fileContent = (await s3.downloadFile(sourceKey, {
    encoding: 'utf8',
  })) as string;

  let currentChunk: any[] = [];
  let chunkNumber = 0;
  let recordNumber = 0;

  // Stream through file and create chunks
  for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
    currentChunk.push(record);
    recordNumber++;

    // Create chunk when size reached
    if (currentChunk.length >= chunkSize) {
      const chunkId = `chunk-${chunkNumber.toString().padStart(5, '0')}`;
      const chunkKey = `temp/${workflowId}/${chunkId}.csv`;

      // Convert chunk to CSV
      const chunkCSV = csvParser.stringify(currentChunk, { headers: true });

      // Upload chunk to S3
      await s3.uploadFile(chunkKey, chunkCSV, {
        contentType: 'text/csv',
      });

      // Create chunk metadata
      const metadata: ChunkMetadata = {
        chunkId,
        startRecord: recordNumber - currentChunk.length,
        endRecord: recordNumber - 1,
        s3Key: chunkKey,
        recordCount: currentChunk.length,
        status: 'pending',
      };

      chunks.push(metadata);

      // Store chunk metadata in KV
      await kv.set(['chunk', workflowId, chunkId], metadata);

      logger.info('Chunk created', {
        chunkId,
        recordCount: currentChunk.length,
        s3Key: chunkKey,
      });

      // Clear chunk (free memory)
      currentChunk = [];
      chunkNumber++;
    }
  }

  // Handle remaining records
  if (currentChunk.length > 0) {
    const chunkId = `chunk-${chunkNumber.toString().padStart(5, '0')}`;
    const chunkKey = `temp/${workflowId}/${chunkId}.csv`;

    const chunkCSV = csvParser.stringify(currentChunk, { headers: true });
    await s3.uploadFile(chunkKey, chunkCSV, { contentType: 'text/csv' });

    const metadata: ChunkMetadata = {
      chunkId,
      startRecord: recordNumber - currentChunk.length,
      endRecord: recordNumber - 1,
      s3Key: chunkKey,
      recordCount: currentChunk.length,
      status: 'pending',
    };

    chunks.push(metadata);
    await kv.set(['chunk', workflowId, chunkId], metadata);
  }

  return chunks;
}

/**
 * Process a single chunk
 */
async function processChunk(
  s3: S3DataSource,
  client: any,
  jobId: string,
  chunk: ChunkMetadata
): Promise<void> {
  const csvParser = new CSVParserService();
  const mapper = new UniversalMapper({
    fields: {
      skuRef: { source: 'sku', required: true },
      locationRef: { source: 'location_code', required: true },
      qty: { source: 'quantity', resolver: 'sdk.parseInt' },
      expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
    },
  });

  // Download chunk
  const chunkContent = (await s3.downloadFile(chunk.s3Key, {
    encoding: 'utf8',
  })) as string;

  // Parse chunk
  const records = await csvParser.parse(chunkContent);

  // Map records
  const entities: any[] = [];
  for (const record of records) {
    const mapped = await mapper.map(record);
    if (mapped.success && mapped.data) {
      entities.push(mapped.data);
    }
  }

  // Send batch
  await client.sendBatch(jobId, { entities });

  logger.info('Chunk batch sent', {
    chunkId: chunk.chunkId,
    entityCount: entities.length,
  });
}
```

**VersoriKV Schema:**

```typescript
// Chunk metadata
['chunk', workflowId, chunkId] => ChunkMetadata

// Chunk status
['chunk', workflowId, chunkId, 'status'] => 'pending' | 'processing' | 'completed' | 'failed'

// Workflow state
['state', workflowId, 'sync'] => SyncState
```
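
This key layout is easy to drift from when `kv.get`/`kv.set` tuples are written by hand across several functions. A small sketch of hypothetical helper functions (not part of the SDK) that centralise the tuples from the schema above:

```typescript
// Hypothetical key builders mirroring the documented VersoriKV schema.
// Using these in both the writer and the reader keeps the tuples in sync.
type ChunkStatus = 'pending' | 'processing' | 'completed' | 'failed';

// ['chunk', workflowId, chunkId] => ChunkMetadata
function chunkKey(workflowId: string, chunkId: string): [string, string, string] {
  return ['chunk', workflowId, chunkId];
}

// ['chunk', workflowId, chunkId, 'status'] => ChunkStatus
function chunkStatusKey(workflowId: string, chunkId: string): [string, string, string, string] {
  return ['chunk', workflowId, chunkId, 'status'];
}

// ['state', workflowId, 'sync'] => SyncState
function syncStateKey(workflowId: string): [string, string, string] {
  return ['state', workflowId, 'sync'];
}
```

A call site then becomes `kv.set(chunkStatusKey(workflowId, chunk.chunkId), 'processing')` instead of an inline tuple.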

**Performance:**

```
File Size: 5GB (10M records)
Chunk Size: 100K records
Total Chunks: 100
Processing Time: ~60 minutes (sequential)
RAM Usage: ~100MB (processes one chunk at a time)
```

---

## Pattern 3: Parallel Processing (High Performance)

**Best for:** 5M-10M records, time-sensitive ingestion, need speed with reliability

**Strategy:**

1. Split file into chunks (same as Pattern 2)
2. Spawn 5 parallel Batch API jobs
3. Process chunks concurrently
4. Track progress in VersoriKV
5. Resume on failure

### Implementation

```typescript
import {
  createClient,
  CSVParserService,
  S3DataSource,
  UniversalMapper,
  StateService,
  VersoriKVAdapter,
  createConsoleLogger,
  toStructuredLogger
} from '@fluentcommerce/fc-connect-sdk';

const logger = createConsoleLogger();

interface ParallelJob {
  jobId: string;
  assignedChunks: string[];
  status: 'pending' | 'processing' | 'completed' | 'failed';
  recordsProcessed: number;
  startedAt?: string;
  completedAt?: string;
}

async function parallelIngestion(ctx: any) {
  logger.info('Starting parallel ingestion');

  // Initialize services
  const client = await createClient(ctx);

  const s3 = new S3DataSource(
    {
      type: 'S3_CSV',
      connectionId: 'my-s3-parallel',
      name: 'Inventory Files S3 Parallel',
      s3Config: {
        bucket: 'inventory-files',
        region: 'us-east-1',
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
      },
    },
    logger
  );

  const kv = ctx.openKv();
  const kvAdapter = new VersoriKVAdapter(kv);
  const stateService = new StateService(logger);

  const SOURCE_FILE = 'inventory/huge-inventory.csv';
  const CHUNK_SIZE = 100000; // 100K records per chunk
  const PARALLEL_JOBS = 5; // Process 5 chunks concurrently
  const workflowId = 'parallel-ingestion';

  // STEP 1: Split file into chunks (reuse from Pattern 2)
  const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);

  logger.info('File split complete', {
    totalChunks: chunks.length,
    totalRecords: chunks.reduce((sum, c) => sum + c.recordCount, 0),
  });

  // STEP 2: Create multiple jobs for parallel processing
  const jobs: ParallelJob[] = [];

  for (let i = 0; i < PARALLEL_JOBS; i++) {
    const job = await client.createJob({
      name: `parallel-inventory-ingestion-job-${i + 1}`,
      retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
    });

    jobs.push({
      jobId: job.id,
      assignedChunks: [],
      status: 'pending',
      recordsProcessed: 0,
    });

    logger.info('Parallel job created', {
      jobNumber: i + 1,
      jobId: job.id,
    });
  }

  // STEP 3: Distribute chunks across jobs (round-robin)
  chunks.forEach((chunk, index) => {
    const jobIndex = index % PARALLEL_JOBS;
    jobs[jobIndex].assignedChunks.push(chunk.chunkId);
  });

  logger.info('Chunks distributed', {
    totalChunks: chunks.length,
    jobCount: PARALLEL_JOBS,
    chunksPerJob: jobs.map(j => j.assignedChunks.length),
  });

  // STEP 4: Process chunks in parallel
  const startTime = Date.now();

  const jobPromises = jobs.map((job, jobIndex) =>
    processJobChunks(
      s3,
      client,
      job,
      chunks.filter(c => job.assignedChunks.includes(c.chunkId)),
      workflowId,
      kvAdapter,
      jobIndex + 1
    )
  );

  // Wait for all jobs to complete
  const results = await Promise.allSettled(jobPromises);
  const duration = (Date.now() - startTime) / 1000;

  // STEP 5: Analyze results
  let successfulJobs = 0;
  let failedJobs = 0;
  let totalRecordsProcessed = 0;

  results.forEach((result, index) => {
    if (result.status === 'fulfilled') {
      successfulJobs++;
      totalRecordsProcessed += result.value.recordsProcessed;

      logger.info('Job completed', {
        jobNumber: index + 1,
        jobId: jobs[index].jobId,
        recordsProcessed: result.value.recordsProcessed,
        chunksProcessed: result.value.chunksProcessed,
      });
    } else {
      failedJobs++;
      logger.error('Job failed', result.reason, {
        jobNumber: index + 1,
        jobId: jobs[index].jobId,
      });
    }
  });

  // STEP 6: Update final state
  await stateService.updateSyncState(
    kvAdapter,
    [
      {
        fileName: SOURCE_FILE,
        lastModified: new Date().toISOString(),
        recordCount: totalRecordsProcessed,
      },
    ],
    workflowId
  );

  logger.info('Parallel ingestion complete', {
    totalChunks: chunks.length,
    parallelJobs: PARALLEL_JOBS,
    successfulJobs,
    failedJobs,
    totalRecordsProcessed,
    durationSeconds: duration,
    recordsPerSecond: Math.round(totalRecordsProcessed / duration),
  });

  return {
    success: failedJobs === 0,
    totalChunks: chunks.length,
    totalRecordsProcessed,
    successfulJobs,
    failedJobs,
    durationSeconds: duration,
    recordsPerSecond: Math.round(totalRecordsProcessed / duration),
  };
}

/**
 * Process all chunks assigned to a job
 */
async function processJobChunks(
  s3: S3DataSource,
  client: any,
  job: ParallelJob,
  chunks: ChunkMetadata[],
  workflowId: string,
  kv: VersoriKVAdapter,
  jobNumber: number
): Promise<{ recordsProcessed: number; chunksProcessed: number }> {
  logger.info(`Job ${jobNumber} starting`, {
    jobId: job.jobId,
    assignedChunks: chunks.length,
  });

  let recordsProcessed = 0;
  let chunksProcessed = 0;

  for (const chunk of chunks) {
    try {
      // Check if chunk already processed
      const chunkState = await kv.get(['chunk', workflowId, chunk.chunkId, 'status']);

      if (chunkState?.value === 'completed') {
        logger.info(`Job ${jobNumber}: Chunk already processed`, {
          chunkId: chunk.chunkId,
        });
        chunksProcessed++;
        continue;
      }

      // Mark chunk as processing
      await kv.set(['chunk', workflowId, chunk.chunkId, 'status'], 'processing');

      logger.info(`Job ${jobNumber}: Processing chunk`, {
        chunkId: chunk.chunkId,
        recordCount: chunk.recordCount,
        progress: `${chunksProcessed}/${chunks.length}`,
      });

      // Process chunk
      await processChunk(s3, client, job.jobId, chunk);

      // Mark chunk as completed
      await kv.set(['chunk', workflowId, chunk.chunkId], {
        ...chunk,
        status: 'completed',
        processedAt: new Date().toISOString(),
      } as ChunkMetadata);

      recordsProcessed += chunk.recordCount;
      chunksProcessed++;

      logger.info(`Job ${jobNumber}: Chunk completed`, {
        chunkId: chunk.chunkId,
        recordsProcessed,
        chunksProcessed,
        percentComplete: ((chunksProcessed / chunks.length) * 100).toFixed(1),
      });
    } catch (error) {
      logger.error(`Job ${jobNumber}: Chunk failed`, error as Error, {
        chunkId: chunk.chunkId,
      });

      // Mark chunk as failed (don't throw - continue with remaining chunks)
      await kv.set(['chunk', workflowId, chunk.chunkId], {
        ...chunk,
        status: 'failed',
        error: (error as Error).message,
      } as ChunkMetadata);
    }
  }

  logger.info(`Job ${jobNumber} completed`, {
    jobId: job.jobId,
    recordsProcessed,
    chunksProcessed,
  });

  return { recordsProcessed, chunksProcessed };
}
```

**Progress Tracking:**

```typescript
// Real-time progress query
async function getIngestionProgress(
  workflowId: string,
  kv: VersoriKVAdapter
): Promise<{
  totalChunks: number;
  completedChunks: number;
  failedChunks: number;
  processingChunks: number;
  percentComplete: number;
}> {
  // This would query all chunk statuses from KV
  // Simplified example:
  const chunks = await getAllChunkMetadata(workflowId, kv);

  const completed = chunks.filter(c => c.status === 'completed').length;
  const failed = chunks.filter(c => c.status === 'failed').length;
  const processing = chunks.filter(c => c.status === 'processing').length;

  return {
    totalChunks: chunks.length,
    completedChunks: completed,
    failedChunks: failed,
    processingChunks: processing,
    percentComplete: (completed / chunks.length) * 100,
  };
}
```
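
`getAllChunkMetadata` is referenced above but never defined in this guide. A minimal sketch, assuming the KV adapter exposes a Deno-KV-style `list({ prefix })` async iterator (the real `VersoriKVAdapter` listing API may differ); the trimmed `ChunkMetadata` and `KVLike` types here exist only to keep the sketch self-contained:

```typescript
// Trimmed local copy of the fields the progress query needs.
interface ChunkMetadata {
  chunkId: string;
  s3Key: string;
  recordCount: number;
  status: 'pending' | 'processing' | 'completed' | 'failed';
}

// Structural stand-in for the KV adapter; assumes a prefix-scan API.
interface KVListEntry {
  key: unknown[];
  value: unknown;
}
interface KVLike {
  list(options: { prefix: unknown[] }): AsyncIterable<KVListEntry>;
}

async function getAllChunkMetadata(workflowId: string, kv: KVLike): Promise<ChunkMetadata[]> {
  const chunks: ChunkMetadata[] = [];
  for await (const entry of kv.list({ prefix: ['chunk', workflowId] })) {
    // Keep only the metadata entries (['chunk', workflowId, chunkId]),
    // skipping the 4-element ['chunk', workflowId, chunkId, 'status'] sub-keys.
    if (entry.key.length === 3) {
      chunks.push(entry.value as ChunkMetadata);
    }
  }
  return chunks;
}
```

The `key.length === 3` filter matters because the schema stores status both inside the metadata object and under a separate sub-key.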

**Performance:**

```
File Size: 5GB (10M records)
Chunk Size: 100K records
Total Chunks: 100
Parallel Jobs: 5
Processing Time: ~15 minutes (4x speedup)
RAM Usage: ~500MB (5 chunks in parallel)
Throughput: ~11,111 records/second
```

---

## Pattern 4: Distributed Processing (Versori Workflows)

**Best for:** 10M+ records, enterprise scale, need maximum reliability and observability

**Strategy:**

1. Coordinator workflow splits file and creates scheduled tasks
2. Each worker workflow processes one chunk
3. Coordinator tracks completion via VersoriKV
4. Automatic retry on worker failure

### Coordinator Workflow

```typescript
import { fn, schedule } from '@versori/run';
import {
  createClient,
  S3DataSource,
  VersoriKVAdapter,
  createConsoleLogger,
  toStructuredLogger
} from '@fluentcommerce/fc-connect-sdk';

const logger = createConsoleLogger();

/**
 * Coordinator workflow - splits file and spawns workers
 */
export const coordinatorWorkflow = schedule('coordinator')
  .cron('0 2 * * *') // Run daily at 2 AM
  .then(
    fn('split-and-schedule', async ({ activation, connections, kv }) => {
      logger.info('Coordinator: Starting distributed ingestion');

      const s3 = new S3DataSource(
        {
          type: 'S3_CSV',
          connectionId: 'my-s3-3',
          name: 'Inventory Files S3 3',
          s3Config: {
            bucket: 'inventory-files',
            region: 'us-east-1',
            accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
            secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
          },
        },
        logger
      );

      const kvAdapter = new VersoriKVAdapter(kv);
      const workflowId = `distributed-${Date.now()}`;
      const SOURCE_FILE = 'inventory/enterprise-inventory.csv';
      const CHUNK_SIZE = 100000;

      // Split file into chunks
      const chunks = await splitFileIntoChunks(s3, SOURCE_FILE, CHUNK_SIZE, workflowId, kvAdapter);

      logger.info('Coordinator: File split complete', {
        totalChunks: chunks.length,
        workflowId,
      });

      // Store coordinator state
      await kvAdapter.set(['coordinator', workflowId], {
        workflowId,
        sourceFile: SOURCE_FILE,
        totalChunks: chunks.length,
        status: 'scheduled',
        createdAt: new Date().toISOString(),
      });

      // Schedule worker for each chunk
      for (const chunk of chunks) {
        // Trigger worker workflow (Versori will handle scheduling)
        await activation.triggerWorkflow('chunk-worker', {
          workflowId,
          chunkId: chunk.chunkId,
          chunkKey: chunk.s3Key,
          recordCount: chunk.recordCount,
        });

        logger.info('Coordinator: Worker scheduled', {
          chunkId: chunk.chunkId,
          workflowId,
        });
      }

      return {
        workflowId,
        totalChunks: chunks.length,
        message: `Scheduled ${chunks.length} worker workflows`,
      };
    })
  );

/**
 * Monitor workflow - checks completion status
 */
export const monitorWorkflow = schedule('monitor')
  .cron('*/5 * * * *') // Run every 5 minutes
  .then(
    fn('check-progress', async ({ kv }) => {
      const kvAdapter = new VersoriKVAdapter(kv);

      // Get all active coordinators
      const coordinators = await getActiveCoordinators(kvAdapter);

      for (const coordinator of coordinators) {
        const progress = await getIngestionProgress(coordinator.workflowId, kvAdapter);

        logger.info('Monitor: Progress update', {
          workflowId: coordinator.workflowId,
          ...progress,
        });

        // Check if complete
        if (progress.completedChunks + progress.failedChunks === progress.totalChunks) {
          // Mark coordinator as complete
          await kvAdapter.set(['coordinator', coordinator.workflowId], {
            ...coordinator,
            status: 'completed',
            completedAt: new Date().toISOString(),
            progress,
          });

          logger.info('Monitor: Ingestion complete', {
            workflowId: coordinator.workflowId,
            ...progress,
          });
        }
      }

      return { coordinatorsChecked: coordinators.length };
    })
  );
```
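
The monitor workflow calls `getActiveCoordinators`, which is not defined anywhere in this guide. A sketch under the same assumption as the progress query: that the adapter can enumerate entries by key prefix via a hypothetical `list({ prefix })` iterator. Swap in the real `VersoriKVAdapter` listing API; the local `CoordinatorState` type just mirrors the object the coordinator stores:

```typescript
// Shape of the state object the coordinator workflow writes to
// ['coordinator', workflowId].
interface CoordinatorState {
  workflowId: string;
  sourceFile: string;
  totalChunks: number;
  status: string; // 'scheduled' | 'completed' | ...
  createdAt: string;
}

// Structural stand-in for the KV adapter; assumes a prefix-scan API.
interface KVEntry {
  key: unknown[];
  value: unknown;
}
interface KVLike {
  list(options: { prefix: unknown[] }): AsyncIterable<KVEntry>;
}

async function getActiveCoordinators(kv: KVLike): Promise<CoordinatorState[]> {
  const active: CoordinatorState[] = [];
  // Scan every ['coordinator', workflowId] entry and keep the ones
  // the monitor has not yet marked 'completed'.
  for await (const entry of kv.list({ prefix: ['coordinator'] })) {
    const state = entry.value as CoordinatorState;
    if (state.status !== 'completed') {
      active.push(state);
    }
  }
  return active;
}
```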

### Worker Workflow

```typescript
import { fn, webhook } from '@versori/run';
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  VersoriKVAdapter,
  createConsoleLogger,
  toStructuredLogger
} from '@fluentcommerce/fc-connect-sdk';

const logger = createConsoleLogger();

/**
 * Worker workflow - processes a single chunk
 */
export const chunkWorker = webhook('chunk-worker').then(
  fn('process-chunk', async (ctx: any) => {
    // Keep a handle on the full context: createClient and the
    // retailerId fallback below both need it
    const { data, kv } = ctx;
    const { workflowId, chunkId, chunkKey, recordCount } = data;

    logger.info('Worker: Starting chunk processing', {
      workflowId,
      chunkId,
      recordCount,
    });

    const kvAdapter = new VersoriKVAdapter(kv);

    // Check if already processed
    const chunkState = await kvAdapter.get(['chunk', workflowId, chunkId, 'status']);

    if (chunkState?.value === 'completed') {
      logger.info('Worker: Chunk already processed', { chunkId });
      return { chunkId, status: 'skipped', message: 'Already processed' };
    }

    // Mark as processing
    await kvAdapter.set(['chunk', workflowId, chunkId, 'status'], 'processing');

    try {
      // Initialize services
      const client = await createClient(ctx);

      const s3 = new S3DataSource(
        {
          type: 'S3_CSV',
          connectionId: 'my-s3-4',
          name: 'Inventory Files S3 4',
          s3Config: {
            bucket: 'inventory-files',
            region: 'us-east-1',
            accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
            secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
          },
        },
        logger
      );

      const csvParser = new CSVParserService();
      const mapper = new UniversalMapper({
        fields: {
          skuRef: { source: 'sku', required: true },
          locationRef: { source: 'location_code', required: true },
          qty: { source: 'quantity', resolver: 'sdk.parseInt' },
          expectedOn: { source: 'expected_date', resolver: 'sdk.formatDate' },
        },
      });

      // Get or create job for this workflow
      let jobId = await kvAdapter.get(['job', workflowId, 'jobId']);

      if (!jobId?.value) {
        const job = await client.createJob({
          name: `distributed-ingestion-${workflowId}`,
          retailerId: client.getRetailerId() || ctx.connections?.fluent_commerce?.retailerId,
        });

        await kvAdapter.set(['job', workflowId, 'jobId'], job.id);
        jobId = { value: job.id };
      }

      // Download chunk
      const chunkContent = (await s3.downloadFile(chunkKey, {
        encoding: 'utf8',
      })) as string;

      // Parse chunk
      const records = await csvParser.parse(chunkContent);

      // Map records
      const entities: any[] = [];
      for (const record of records) {
        const mapped = await mapper.map(record);
        if (mapped.success && mapped.data) {
          entities.push(mapped.data);
        }
      }

      // Send batch
      await client.sendBatch(jobId.value as string, { entities });

      // Mark as completed
      await kvAdapter.set(['chunk', workflowId, chunkId], {
        chunkId,
        s3Key: chunkKey,
        recordCount: entities.length,
        status: 'completed',
        processedAt: new Date().toISOString(),
      });

      logger.info('Worker: Chunk completed', {
        workflowId,
        chunkId,
        recordCount: entities.length,
      });

      return {
        chunkId,
        status: 'completed',
        recordsProcessed: entities.length,
      };
    } catch (error) {
      logger.error('Worker: Chunk failed', error as Error, {
        workflowId,
        chunkId,
      });

      // Mark as failed
      await kvAdapter.set(['chunk', workflowId, chunkId], {
        chunkId,
        s3Key: chunkKey,
        recordCount,
        status: 'failed',
        error: (error as Error).message,
      });

      throw error;
    }
  })
);
```

**Performance:**

```
File Size: 10GB (20M records)
Chunk Size: 100K records
Total Chunks: 200
Worker Workflows: 200 (parallel)
Processing Time: ~10 minutes (Versori handles parallelism)
RAM Usage: ~50MB per worker
Throughput: ~33,333 records/second
```

---

## Memory Optimization Tips

### 1. Use Streaming APIs

```typescript
// ❌ WRONG - Loads entire file into memory, then parses it all at once
const fileContent = await fs.readFile('huge.csv', 'utf-8');
const records = await csvParser.parse(fileContent);

// ✅ CORRECT - Streams records incrementally
for await (const record of csvParser.parseStreaming(fileContent, {}, 1)) {
  await processRecord(record);
}
```

### 2. Clear Batches After Processing

```typescript
let batch: any[] = [];
for await (const record of records) {
  batch.push(record);

  if (batch.length >= 1000) {
    await sendBatch(batch);
    batch = []; // ✅ Clear batch to free memory
  }
}
```

### 3. Monitor Memory Usage

```typescript
function logMemoryUsage() {
  const used = process.memoryUsage();
  console.log({
    heapUsed: Math.round(used.heapUsed / 1024 / 1024) + ' MB',
    heapTotal: Math.round(used.heapTotal / 1024 / 1024) + ' MB',
    rss: Math.round(used.rss / 1024 / 1024) + ' MB',
  });
}

// Log every 10K records
if (recordsProcessed % 10000 === 0) {
  logMemoryUsage();
}
```

### 4. Use Garbage Collection Hints

```typescript
// Force garbage collection (requires --expose-gc flag)
if (recordsProcessed % 100000 === 0 && global.gc) {
  global.gc();
  logger.info('Garbage collection triggered', { recordsProcessed });
}
```

---
1293
+
1294
+ ## Performance Benchmarks
1295
+
1296
+ ### Pattern Comparison (10M records, 5GB file)
1297
+
1298
+ | Pattern | Time | RAM | Throughput | Complexity |
1299
+ | ------------------------- | ------ | ------ | -------------- | ---------- |
1300
+ | 1. Basic Streaming | 90 min | 50MB | 1,852 rec/sec | Low |
1301
+ | 2. File Chunking | 60 min | 100MB | 2,778 rec/sec | Medium |
1302
+ | 3. Parallel Processing | 15 min | 500MB | 11,111 rec/sec | High |
1303
+ | 4. Distributed Processing | 10 min | 50MB\* | 16,667 rec/sec | Very High |
1304
+
1305
+ \*Per worker; total RAM = 50MB × worker count

### Optimization Impact

| Optimization              | Before  | After    | Improvement |
| ------------------------- | ------- | -------- | ----------- |
| Streaming vs Loading      | 5GB RAM | 50MB RAM | 100x        |
| Batching (1K vs 10K)      | 90 min  | 60 min   | 1.5x        |
| Parallel (1 vs 5 jobs)    | 60 min  | 15 min   | 4x          |
| Distributed (200 workers) | 15 min  | 10 min   | 1.5x        |

---

## Common Issues & Solutions

### Issue 1: Out of Memory

**Symptoms:**

```
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
```

**Solutions:**

1. Switch to streaming (Pattern 1)
2. Reduce the batch size (e.g. 1,000 → 500)
3. Increase the Node.js heap: `node --max-old-space-size=4096`
4. Use file chunking (Pattern 2)

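Solutions 1 and 2 combine naturally: a small batching helper makes the batch size a single tunable knob. A minimal sketch (the `inBatches` helper is ours, not an SDK export):

```typescript
// Hypothetical helper: batch any async iterable with a configurable size,
// so dialing the batch size down (e.g. 1,000 -> 500) is a one-line change.
async function* inBatches<T>(source: AsyncIterable<T>, size: number): AsyncGenerator<T[]> {
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length >= size) {
      yield batch;
      batch = []; // hand off and release the previous array
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}

// Usage sketch (records and sendBatch are stand-ins for your own pipeline):
// for await (const batch of inBatches(records, 500)) {
//   await sendBatch(batch);
// }
```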
### Issue 2: Timeout on Large Files

**Symptoms:**

```
TimeoutError: Operation timed out after 300000ms
```

**Solutions:**

1. Increase the timeout: `config.timeout = 600000` (10 min)
2. Split the file into chunks (Pattern 2)
3. Use parallel processing (Pattern 3)

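If raising the global timeout isn't enough, a per-chunk timeout wrapper keeps one slow chunk from stalling the whole run. A minimal sketch (the `withTimeout` helper is ours, not part of the SDK):

```typescript
// Assumed helper: race an operation against a timer so a single slow chunk
// fails fast and can be retried, instead of hanging the whole job.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Operation timed out after ${ms}ms`)),
      ms,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Usage sketch (sendChunk is a stand-in for your own upload call):
// await withTimeout(sendChunk(chunk), 600_000); // fail the chunk after 10 min
```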
### Issue 3: Chunks Not Resuming

**Symptoms:**

- The same chunks are re-processed after a failure

**Solutions:**

```typescript
// Inside the chunk loop: check chunk status before processing
const chunkState = await kv.get(['chunk', workflowId, chunkId, 'status']);
if (chunkState?.value === 'completed') {
  logger.info('Chunk already processed, skipping', { chunkId });
  continue;
}
```

### Issue 4: Progress Tracking Inconsistent

**Symptoms:**

- The reported progress percentage doesn't match reality

**Solutions:**

```typescript
// Always update chunk status atomically
const atomic = kv.atomic();
atomic.set(['chunk', workflowId, chunkId, 'status'], 'completed');
atomic.set(['chunk', workflowId, chunkId, 'processedAt'], new Date().toISOString());
await atomic.commit();
```

### Issue 5: Duplicate Processing

**Symptoms:**

- The same records are sent multiple times

**Solutions:**

```typescript
// Use idempotency keys in the Fluent batch payload
await client.sendBatch(jobId, {
  entities,
  meta: {
    chunkId: chunk.chunkId,
    workflowId,
    idempotencyKey: `${workflowId}-${chunk.chunkId}`,
  },
});
```

---

## Related Guides

- [Basic Ingestion Pattern](../standalone/s3-csv-batch-api.md) - For small files (<100K records)
- [Streaming Pattern](../../02-CORE-GUIDES/ingestion/ingestion-readme.md) - For medium files (100K-1M records)
- [Error Handling & Retry](./error-handling-retry.md) - Robust error handling strategies
- [Progress Tracking](../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-08-performance-optimization.md) - Real-time progress monitoring
- [State Management](../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-state-management.md) - VersoriKV patterns

---

## Summary

**Choose Your Pattern:**

- **Pattern 1 (Streaming)**: Simple and memory-efficient; suitable for 100K-1M records
- **Pattern 2 (Chunking)**: Checkpoint/resume support; suitable for 1M-5M records
- **Pattern 3 (Parallel)**: High performance; suitable for 5M-10M records
- **Pattern 4 (Distributed)**: Enterprise scale; suitable for 10M+ records

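The record-count ranges above can be encoded as a small lookup, handy for sizing a job automatically from an estimate (the helper and its names are illustrative, not SDK APIs):

```typescript
// Illustrative helper: pick a pattern from an estimated record count,
// mirroring the ranges listed above.
type IngestionPattern = 'streaming' | 'chunking' | 'parallel' | 'distributed';

function choosePattern(estimatedRecords: number): IngestionPattern {
  if (estimatedRecords <= 1_000_000) return 'streaming'; // Pattern 1
  if (estimatedRecords <= 5_000_000) return 'chunking'; // Pattern 2
  if (estimatedRecords <= 10_000_000) return 'parallel'; // Pattern 3
  return 'distributed'; // Pattern 4
}
```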
**Key Takeaways:**

1. Always use streaming APIs for large files
2. Clear batches after processing to free memory
3. Use chunks + VersoriKV for checkpoint/resume
4. Parallel processing trades RAM for speed
5. Monitor memory usage throughout processing
6. Test with representative file sizes before production