@fluentcommerce/fc-connect-sdk 0.1.54 → 0.1.55
This diff reflects the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and shows the changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +12 -0
- package/dist/cjs/clients/fluent-client.js +13 -6
- package/dist/cjs/utils/pagination-helpers.js +38 -2
- package/dist/cjs/versori/fluent-versori-client.js +11 -5
- package/dist/esm/clients/fluent-client.js +13 -6
- package/dist/esm/utils/pagination-helpers.js +38 -2
- package/dist/esm/versori/fluent-versori-client.js +11 -5
- package/dist/tsconfig.esm.tsbuildinfo +1 -1
- package/dist/tsconfig.tsbuildinfo +1 -1
- package/dist/tsconfig.types.tsbuildinfo +1 -1
- package/docs/00-START-HERE/EXPORT-VALIDATION.md +158 -158
- package/docs/00-START-HERE/cli-analyze-source-structure-guide.md +655 -655
- package/docs/00-START-HERE/cli-documentation-index.md +202 -202
- package/docs/00-START-HERE/cli-quick-reference.md +252 -252
- package/docs/00-START-HERE/decision-tree.md +552 -552
- package/docs/00-START-HERE/getting-started.md +1070 -1070
- package/docs/00-START-HERE/mapper-quick-decision-guide.md +235 -235
- package/docs/00-START-HERE/readme.md +237 -237
- package/docs/00-START-HERE/retailerid-configuration.md +404 -404
- package/docs/00-START-HERE/sdk-philosophy.md +794 -794
- package/docs/00-START-HERE/troubleshooting-quick-reference.md +1086 -1086
- package/docs/01-TEMPLATES/faq.md +686 -686
- package/docs/01-TEMPLATES/patterns/pattern-templates-guide.md +68 -68
- package/docs/01-TEMPLATES/patterns/patterns-csv-schema-validation-and-rejection-report.md +233 -233
- package/docs/01-TEMPLATES/patterns/patterns-custom-resolvers.md +407 -407
- package/docs/01-TEMPLATES/patterns/patterns-error-handling-retry.md +511 -511
- package/docs/01-TEMPLATES/patterns/patterns-field-mapping-universal.md +701 -701
- package/docs/01-TEMPLATES/patterns/patterns-large-file-splitting.md +1430 -1430
- package/docs/01-TEMPLATES/patterns/patterns-master-data-etl.md +2399 -2399
- package/docs/01-TEMPLATES/patterns/patterns-pagination-streaming.md +447 -447
- package/docs/01-TEMPLATES/patterns/patterns-state-duplicate-prevention.md +385 -385
- package/docs/01-TEMPLATES/readme.md +957 -957
- package/docs/01-TEMPLATES/standalone/standalone-asn-inbound-processing.md +1209 -1209
- package/docs/01-TEMPLATES/standalone/standalone-graphql-query-export.md +1140 -1140
- package/docs/01-TEMPLATES/standalone/standalone-graphql-to-parquet-partitioned-s3.md +432 -432
- package/docs/01-TEMPLATES/standalone/standalone-multi-channel-inventory-sync.md +1185 -1185
- package/docs/01-TEMPLATES/standalone/standalone-multi-source-aggregation.md +1462 -1462
- package/docs/01-TEMPLATES/standalone/standalone-s3-csv-batch-api.md +1390 -1390
- package/docs/01-TEMPLATES/standalone/standalone-s3-csv-inventory-to-batch.md +330 -330
- package/docs/01-TEMPLATES/standalone/standalone-scripts-guide.md +87 -87
- package/docs/01-TEMPLATES/standalone/standalone-sftp-xml-graphql.md +1444 -1444
- package/docs/01-TEMPLATES/standalone/standalone-webhook-payload-processing.md +688 -688
- package/docs/01-TEMPLATES/versori/business-examples/business-examples-dropship-order-routing.md +193 -193
- package/docs/01-TEMPLATES/versori/business-examples/business-examples-graphql-parquet-extraction.md +518 -518
- package/docs/01-TEMPLATES/versori/business-examples/business-examples-inter-location-transfers.md +2162 -2162
- package/docs/01-TEMPLATES/versori/business-examples/business-examples-pre-order-allocation.md +2226 -2226
- package/docs/01-TEMPLATES/versori/business-examples/business-scenarios-guide.md +87 -87
- package/docs/01-TEMPLATES/versori/patterns/versori-patterns-connection-validation-pattern.md +656 -656
- package/docs/01-TEMPLATES/versori/patterns/versori-patterns-dual-workflow-connector.md +835 -835
- package/docs/01-TEMPLATES/versori/patterns/versori-patterns-guide.md +108 -108
- package/docs/01-TEMPLATES/versori/patterns/versori-patterns-kv-state-management.md +1533 -1533
- package/docs/01-TEMPLATES/versori/patterns/versori-patterns-xml-response-patterns.md +1160 -1160
- package/docs/01-TEMPLATES/versori/versori-platform-guide.md +201 -201
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-asn-purchase-order.md +1906 -1906
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-dropship-routing.md +1074 -1074
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-flash-sale-reserve.md +1395 -1395
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-generic-xml-order.md +888 -888
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-payment-gateway-integration.md +2478 -2478
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-rma-returns-comprehensive.md +2240 -2240
- package/docs/01-TEMPLATES/versori/webhooks/template-webhook-xml-order-ingestion.md +2029 -2029
- package/docs/01-TEMPLATES/versori/webhooks/webhook-templates-guide.md +140 -140
- package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/inventory-mapping.json +20 -20
- package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/products_2025-01-22.csv +11 -11
- package/docs/01-TEMPLATES/versori/workflows/_examples/sample-data/sample-data-guide.md +34 -34
- package/docs/01-TEMPLATES/versori/workflows/_examples/workflow-examples-guide.md +36 -36
- package/docs/01-TEMPLATES/versori/workflows/extraction/extraction-modes-guide.md +1038 -1038
- package/docs/01-TEMPLATES/versori/workflows/extraction/extraction-workflows-guide.md +138 -138
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/graphql-extraction-guide.md +63 -63
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-fulfillments-to-sftp-csv.md +2062 -2062
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-fulfillments-to-sftp-xml.md +2294 -2294
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-positions-to-s3-csv.md +2461 -2461
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-positions-to-sftp-xml.md +2529 -2529
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-quantities-to-s3-csv.md +2464 -2464
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-inventory-quantities-to-s3-json.md +1959 -1959
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-orders-to-s3-csv.md +1953 -1953
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-orders-to-sftp-xml.md +2541 -2541
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-products-to-s3-json.md +2384 -2384
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-products-to-sftp-xml.md +2445 -2445
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-s3-csv.md +2355 -2355
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-s3-json.md +2042 -2042
- package/docs/01-TEMPLATES/versori/workflows/extraction/graphql-queries/template-extraction-virtual-positions-to-sftp-xml.md +2726 -2726
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/batch-api-guide.md +206 -206
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-cycle-count-reconciliation.md +2030 -2030
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-multi-channel-inventory-sync.md +1882 -1882
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-csv-inventory-batch.md +2827 -2827
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-json-inventory-batch.md +1952 -1952
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-s3-xml-inventory-batch.md +3289 -3289
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-csv-inventory-batch.md +3064 -3064
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-json-inventory-batch.md +3238 -3238
- package/docs/01-TEMPLATES/versori/workflows/ingestion/batch-api/template-ingestion-sftp-xml-inventory-batch.md +2977 -2977
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/event-api-guide.md +321 -321
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-payload-json-order-cancel-event.md +959 -959
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-payload-xml-order-cancel-event.md +1170 -1170
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-csv-product-event.md +2312 -2312
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-json-product-event.md +2999 -2999
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-parquet-product-event.md +2836 -2836
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-s3-xml-product-event.md +2395 -2395
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-csv-product-event.md +2295 -2295
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-json-product-event.md +2602 -2602
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-parquet-product-event.md +2589 -2589
- package/docs/01-TEMPLATES/versori/workflows/ingestion/event-api/template-ingestion-sftp-xml-product-event.md +3578 -3578
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/graphql-mutations-guide.md +93 -93
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-payload-json-order-update-graphql.md +1260 -1260
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-payload-xml-order-update-graphql.md +1472 -1472
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-control-graphql.md +2417 -2417
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-location-graphql.md +2811 -2811
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-csv-price-graphql.md +2619 -2619
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-json-location-graphql.md +2807 -2807
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-s3-xml-location-graphql.md +2373 -2373
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-csv-control-graphql.md +2740 -2740
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-csv-location-graphql.md +2760 -2760
- package/docs/01-TEMPLATES/versori/workflows/ingestion/graphql-mutations/template-ingestion-sftp-json-location-graphql.md +1710 -1710
- package/docs/01-TEMPLATES/versori/workflows/ingestion/ingestion-workflows-guide.md +136 -136
- package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/rubix-webhooks-guide.md +520 -520
- package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-fulfilment-to-sftp-xml-inline.md +1418 -1418
- package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-fulfilment-to-sftp-xml-universal-mapper.md +1785 -1785
- package/docs/01-TEMPLATES/versori/workflows/rubix-webhooks/template-webhook-rubix-order-attribute-update.md +824 -824
- package/docs/01-TEMPLATES/versori/workflows/workflows-overview-guide.md +646 -646
- package/docs/02-CORE-GUIDES/advanced-services/advanced-services-batch-archival.md +724 -724
- package/docs/02-CORE-GUIDES/advanced-services/advanced-services-job-tracker.md +627 -627
- package/docs/02-CORE-GUIDES/advanced-services/advanced-services-partial-batch-recovery.md +561 -561
- package/docs/02-CORE-GUIDES/advanced-services/advanced-services-quick-reference.md +367 -367
- package/docs/02-CORE-GUIDES/advanced-services/advanced-services-readme.md +407 -407
- package/docs/02-CORE-GUIDES/advanced-services/readme.md +49 -49
- package/docs/02-CORE-GUIDES/api-reference/api-reference-quick-reference.md +548 -548
- package/docs/02-CORE-GUIDES/api-reference/event-api-input-output-reference.md +702 -1171
- package/docs/02-CORE-GUIDES/api-reference/examples/client-initialization.ts +286 -286
- package/docs/02-CORE-GUIDES/api-reference/graphql-error-classification.md +337 -337
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-01-client-api.md +399 -520
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-03-authentication.md +199 -199
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-04-graphql-mapping.md +925 -925
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-05-services.md +1198 -1198
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-06-data-sources.md +1083 -1083
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-07-parsers.md +1097 -1097
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-08-pagination.md +513 -513
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-08-types.md +545 -597
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-09-error-handling.md +527 -527
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-09-webhook-validation.md +514 -514
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-10-extraction.md +557 -557
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-10-utilities.md +412 -412
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-11-cli-tools.md +423 -423
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-11-error-handling.md +716 -716
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-analyze-source-structure.md +518 -518
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-partial-responses.md +212 -212
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-12-testing.md +300 -300
- package/docs/02-CORE-GUIDES/api-reference/modules/api-reference-13-resolver-builder.md +322 -322
- package/docs/02-CORE-GUIDES/api-reference/readme.md +279 -279
- package/docs/02-CORE-GUIDES/auto-pagination/auto-pagination-quick-reference.md +351 -351
- package/docs/02-CORE-GUIDES/auto-pagination/auto-pagination-readme.md +277 -277
- package/docs/02-CORE-GUIDES/auto-pagination/examples/auto-pagination-readme.md +178 -178
- package/docs/02-CORE-GUIDES/auto-pagination/examples/common-patterns.ts +351 -351
- package/docs/02-CORE-GUIDES/auto-pagination/examples/paginate-products.ts +384 -384
- package/docs/02-CORE-GUIDES/auto-pagination/examples/paginate-virtual-positions.ts +308 -308
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-01-foundations.md +470 -470
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-02-quick-start.md +713 -713
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-03-configuration.md +754 -754
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-04-advanced-patterns.md +732 -732
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-05-sdk-integration.md +847 -847
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-06-troubleshooting.md +359 -359
- package/docs/02-CORE-GUIDES/auto-pagination/modules/auto-pagination-07-api-reference.md +462 -462
- package/docs/02-CORE-GUIDES/auto-pagination/readme.md +54 -54
- package/docs/02-CORE-GUIDES/data-sources/data-sources-file-operations-error-handling.md +1487 -1487
- package/docs/02-CORE-GUIDES/data-sources/data-sources-quick-reference.md +836 -836
- package/docs/02-CORE-GUIDES/data-sources/data-sources-readme.md +276 -276
- package/docs/02-CORE-GUIDES/data-sources/data-sources-sftp-credential-access-security.md +553 -553
- package/docs/02-CORE-GUIDES/data-sources/examples/common-patterns.ts +409 -409
- package/docs/02-CORE-GUIDES/data-sources/examples/data-sources-readme.md +178 -178
- package/docs/02-CORE-GUIDES/data-sources/examples/s3-operations.ts +308 -308
- package/docs/02-CORE-GUIDES/data-sources/examples/sftp-operations.ts +371 -371
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-01-foundations.md +735 -735
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-02-s3-operations.md +1302 -1302
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-03-sftp-operations.md +1379 -1379
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-04-file-patterns.md +941 -941
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-05-advanced-topics.md +813 -813
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-06-integration-patterns.md +486 -486
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-07-troubleshooting.md +387 -387
- package/docs/02-CORE-GUIDES/data-sources/modules/data-sources-08-api-reference.md +417 -417
- package/docs/02-CORE-GUIDES/data-sources/readme.md +77 -77
- package/docs/02-CORE-GUIDES/error-handling-guide.md +936 -936
- package/docs/02-CORE-GUIDES/extraction/examples/02-core-guides-extraction-readme.md +116 -116
- package/docs/02-CORE-GUIDES/extraction/examples/common-patterns.ts +428 -428
- package/docs/02-CORE-GUIDES/extraction/examples/extract-inventory-basic.ts +187 -187
- package/docs/02-CORE-GUIDES/extraction/extraction-quick-reference.md +596 -596
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-01-foundations.md +514 -514
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-02-basic-extraction.md +823 -823
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-03-parquet-processing.md +507 -507
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-04-data-enrichment.md +546 -546
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-05-transformation.md +494 -494
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-06-export-formats.md +458 -458
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-06-performance.md +138 -138
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-07-api-reference.md +148 -148
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-07-optimization.md +692 -692
- package/docs/02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-08-extraction-orchestrator.md +1008 -1008
- package/docs/02-CORE-GUIDES/extraction/readme.md +151 -151
- package/docs/02-CORE-GUIDES/ingestion/examples/_simple-kv-store.ts +40 -40
- package/docs/02-CORE-GUIDES/ingestion/examples/error-recovery.ts +728 -728
- package/docs/02-CORE-GUIDES/ingestion/examples/event-driven.ts +501 -501
- package/docs/02-CORE-GUIDES/ingestion/examples/local-file-ingestion.ts +88 -88
- package/docs/02-CORE-GUIDES/ingestion/examples/parquet-ingestion.ts +117 -117
- package/docs/02-CORE-GUIDES/ingestion/examples/performance-optimized.ts +647 -647
- package/docs/02-CORE-GUIDES/ingestion/examples/s3-csv-ingestion.ts +169 -169
- package/docs/02-CORE-GUIDES/ingestion/examples/sftp-csv-ingestion.ts +134 -134
- package/docs/02-CORE-GUIDES/ingestion/ingestion-quick-reference.md +546 -546
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-01-introduction.md +626 -626
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-02-quick-start.md +658 -658
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-03-data-sources.md +1052 -1052
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-04-field-mapping.md +763 -763
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-05-advanced-parsers.md +676 -676
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-06-batch-api.md +1295 -1295
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-api-reference.md +138 -138
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-07-state-management.md +1037 -1037
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-08-performance-optimization.md +1349 -1349
- package/docs/02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-09-best-practices.md +1893 -1893
- package/docs/02-CORE-GUIDES/ingestion/readme.md +160 -160
- package/docs/02-CORE-GUIDES/logging-guide.md +585 -585
- package/docs/02-CORE-GUIDES/mapping/error-handling-patterns.md +401 -401
- package/docs/02-CORE-GUIDES/mapping/examples/02-core-guides-mapping-readme.md +128 -128
- package/docs/02-CORE-GUIDES/mapping/examples/common-patterns.ts +273 -273
- package/docs/02-CORE-GUIDES/mapping/examples/csv-location-ingestion.json +36 -36
- package/docs/02-CORE-GUIDES/mapping/examples/csv-mapping.ts +242 -242
- package/docs/02-CORE-GUIDES/mapping/examples/graphql-to-parquet-extraction.json +36 -36
- package/docs/02-CORE-GUIDES/mapping/examples/json-mapping.ts +213 -213
- package/docs/02-CORE-GUIDES/mapping/examples/json-product-to-mutation.json +48 -48
- package/docs/02-CORE-GUIDES/mapping/examples/xml-mapping.ts +291 -291
- package/docs/02-CORE-GUIDES/mapping/examples/xml-order-to-mutation.json +45 -45
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/graphql-mutation-mapping-quick-reference.md +463 -463
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/graphql-mutation-mapping-readme.md +227 -227
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-01-introduction.md +222 -222
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-02-quick-start.md +351 -351
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-03-schema-validation.md +569 -569
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-04-mapping-patterns.md +471 -471
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-05-configuration-reference.md +611 -611
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-06-advanced-xpath.md +148 -148
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-06-path-syntax.md +464 -464
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-07-api-reference.md +94 -94
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-07-array-handling.md +307 -307
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-08-custom-resolvers.md +544 -544
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-09-advanced-patterns.md +427 -427
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-10-hooks-and-variables.md +336 -336
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-11-error-handling.md +488 -488
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-12-arguments-vs-nodes.md +383 -383
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/modules/graphql-mutation-mapping-13-best-practices.md +477 -477
- package/docs/02-CORE-GUIDES/mapping/graphql-mutation-mapping/readme.md +62 -62
- package/docs/02-CORE-GUIDES/mapping/mapping-format-decision-tree.md +480 -480
- package/docs/02-CORE-GUIDES/mapping/mapping-graphql-alias-batching-guide.md +820 -820
- package/docs/02-CORE-GUIDES/mapping/mapping-javascript-objects.md +2369 -2369
- package/docs/02-CORE-GUIDES/mapping/mapping-mapper-comparison-guide.md +682 -682
- package/docs/02-CORE-GUIDES/mapping/modules/02-core-guides-mapping-07-api-reference.md +1327 -1327
- package/docs/02-CORE-GUIDES/mapping/modules/02-core-guides-mapping-08-error-handling.md +1142 -1142
- package/docs/02-CORE-GUIDES/mapping/modules/mapping-04-use-cases.md +891 -891
- package/docs/02-CORE-GUIDES/mapping/modules/mapping-06-helpers-resolvers.md +1126 -1126
- package/docs/02-CORE-GUIDES/mapping/modules/mapping-06-sdk-resolvers.md +199 -199
- package/docs/02-CORE-GUIDES/mapping/modules/mapping-07-api-reference.md +1319 -1319
- package/docs/02-CORE-GUIDES/mapping/readme.md +178 -178
- package/docs/02-CORE-GUIDES/mapping/resolver-registration.md +410 -410
- package/docs/02-CORE-GUIDES/mapping/resolvers/examples/common-patterns.ts +226 -226
- package/docs/02-CORE-GUIDES/mapping/resolvers/examples/custom-resolvers.ts +227 -227
- package/docs/02-CORE-GUIDES/mapping/resolvers/examples/sdk-resolvers-usage.ts +203 -203
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-readme.md +274 -274
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-api-reference.md +679 -679
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-cookbook.md +826 -826
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-guide.md +1330 -1330
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-helpers-reference.md +1437 -1437
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-parameters-reference.md +553 -553
- package/docs/02-CORE-GUIDES/mapping/resolvers/mapping-resolvers-resolver-troubleshooting.md +854 -854
- package/docs/02-CORE-GUIDES/mapping/resolvers/readme.md +75 -75
- package/docs/02-CORE-GUIDES/parsers/examples/02-core-guides-parsers-readme.md +161 -161
- package/docs/02-CORE-GUIDES/parsers/examples/csv-parser-examples.ts +110 -110
- package/docs/02-CORE-GUIDES/parsers/examples/json-parser-examples.ts +33 -33
- package/docs/02-CORE-GUIDES/parsers/examples/parquet-parser-examples.ts +47 -47
- package/docs/02-CORE-GUIDES/parsers/examples/xml-parser-examples.ts +38 -38
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-01-foundations.md +355 -355
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-02-csv-parser.md +772 -772
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-03-json-parser.md +789 -789
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-04-xml-parser.md +857 -857
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-05-parquet-parser.md +603 -603
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-06-integration-patterns.md +702 -702
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-06-streaming.md +121 -121
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-07-api-reference.md +89 -89
- package/docs/02-CORE-GUIDES/parsers/modules/02-core-guides-parsers-07-troubleshooting.md +727 -727
- package/docs/02-CORE-GUIDES/parsers/parsers-quick-reference.md +482 -482
- package/docs/02-CORE-GUIDES/parsers/parsers-readme.md +258 -258
- package/docs/02-CORE-GUIDES/parsers/readme.md +65 -65
- package/docs/02-CORE-GUIDES/readme.md +194 -194
- package/docs/02-CORE-GUIDES/webhook-validation/examples/basic-validation.ts +108 -108
- package/docs/02-CORE-GUIDES/webhook-validation/examples/common-patterns.ts +316 -316
- package/docs/02-CORE-GUIDES/webhook-validation/examples/webhook-validation-readme.md +61 -61
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-01-foundations.md +440 -440
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-02-quick-start.md +525 -525
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-03-versori-integration.md +741 -741
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-04-platform-integration.md +629 -629
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-05-configuration.md +535 -535
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-06-error-handling.md +611 -611
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-06-troubleshooting.md +124 -124
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-07-api-reference.md +511 -511
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-08-rubix-webhooks.md +590 -590
- package/docs/02-CORE-GUIDES/webhook-validation/modules/webhook-validation-09-rubix-event-vs-http-call.md +432 -432
- package/docs/02-CORE-GUIDES/webhook-validation/readme.md +239 -239
- package/docs/02-CORE-GUIDES/webhook-validation/webhook-validation-quick-reference.md +392 -392
- package/docs/03-PATTERN-GUIDES/connector-scenarios/connector-scenarios-quick-reference.md +498 -498
- package/docs/03-PATTERN-GUIDES/connector-scenarios/connector-scenarios-readme.md +313 -313
- package/docs/03-PATTERN-GUIDES/connector-scenarios/examples/common-patterns.ts +612 -612
- package/docs/03-PATTERN-GUIDES/connector-scenarios/examples/connector-scenarios-readme.md +253 -253
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-01-foundations.md +452 -452
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-02-simple-scenarios.md +681 -681
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-03-intermediate-scenarios.md +637 -637
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-04-advanced-scenarios.md +650 -650
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-05-bidirectional-sync.md +233 -233
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-06-production-patterns.md +442 -442
- package/docs/03-PATTERN-GUIDES/connector-scenarios/modules/connector-scenarios-07-reference.md +445 -445
- package/docs/03-PATTERN-GUIDES/connector-scenarios/readme.md +31 -31
- package/docs/03-PATTERN-GUIDES/enterprise-integration-patterns.md +1528 -1528
- package/docs/03-PATTERN-GUIDES/error-handling/comprehensive-error-handling-guide.md +1437 -1437
- package/docs/03-PATTERN-GUIDES/error-handling/error-handling-quick-reference.md +390 -390
- package/docs/03-PATTERN-GUIDES/error-handling/examples/common-patterns.ts +438 -438
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-01-foundations.md +362 -362
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-02-error-types.md +850 -850
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-03-utf8-handling.md +456 -456
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-04-error-scenarios.md +658 -658
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-05-calling-patterns.md +671 -671
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-06-retry-strategies.md +1034 -1034
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-07-monitoring.md +653 -653
- package/docs/03-PATTERN-GUIDES/error-handling/modules/error-handling-08-api-reference.md +847 -847
- package/docs/03-PATTERN-GUIDES/error-handling/readme.md +36 -36
- package/docs/03-PATTERN-GUIDES/examples/__tests__/readme.md +40 -40
- package/docs/03-PATTERN-GUIDES/examples/__tests__/resolver-examples.test.js +282 -282
- package/docs/03-PATTERN-GUIDES/examples/test-data/03-pattern-guides-readme.md +110 -110
- package/docs/03-PATTERN-GUIDES/examples/test-data/canonical-inventory.json +123 -123
- package/docs/03-PATTERN-GUIDES/examples/test-data/canonical-order.json +171 -171
- package/docs/03-PATTERN-GUIDES/examples/test-data/readme.md +28 -28
- package/docs/03-PATTERN-GUIDES/extraction/extraction-readme.md +15 -15
- package/docs/03-PATTERN-GUIDES/extraction/readme.md +25 -25
- package/docs/03-PATTERN-GUIDES/file-operations/examples/common-patterns.ts +407 -407
- package/docs/03-PATTERN-GUIDES/file-operations/examples/file-operations-readme.md +142 -142
- package/docs/03-PATTERN-GUIDES/file-operations/file-operations-quick-reference.md +462 -462
- package/docs/03-PATTERN-GUIDES/file-operations/file-operations-readme.md +379 -379
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-01-foundations.md +430 -430
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-02-quick-start.md +484 -484
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-03-s3-operations.md +507 -507
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-04-sftp-operations.md +963 -963
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-05-streaming-performance.md +503 -503
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-06-archive-patterns.md +386 -386
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-06-error-handling.md +117 -117
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-07-api-reference.md +78 -78
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-07-testing-troubleshooting.md +567 -567
- package/docs/03-PATTERN-GUIDES/file-operations/modules/file-operations-08-api-reference.md +1055 -1055
- package/docs/03-PATTERN-GUIDES/file-operations/readme.md +32 -32
- package/docs/03-PATTERN-GUIDES/ingestion/ingestion-readme.md +15 -15
- package/docs/03-PATTERN-GUIDES/ingestion/readme.md +25 -25
- package/docs/03-PATTERN-GUIDES/integration-patterns/examples/batch-processing.ts +130 -130
- package/docs/03-PATTERN-GUIDES/integration-patterns/examples/common-patterns.ts +360 -360
- package/docs/03-PATTERN-GUIDES/integration-patterns/examples/delta-sync.ts +130 -130
- package/docs/03-PATTERN-GUIDES/integration-patterns/examples/integration-patterns-readme.md +100 -100
- package/docs/03-PATTERN-GUIDES/integration-patterns/examples/real-time-webhook.ts +398 -398
- package/docs/03-PATTERN-GUIDES/integration-patterns/integration-patterns-quick-reference.md +962 -962
- package/docs/03-PATTERN-GUIDES/integration-patterns/integration-patterns-readme.md +134 -134
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-01-real-time-processing.md +991 -991
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-02-batch-processing.md +1547 -1547
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-03-delta-sync.md +1108 -1108
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-04-webhook-patterns.md +1181 -1181
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-05-error-handling.md +1061 -1061
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-06-advanced-integration-services.md +1547 -1547
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-06-performance.md +109 -109
- package/docs/03-PATTERN-GUIDES/integration-patterns/modules/integration-patterns-07-api-reference.md +34 -34
- package/docs/03-PATTERN-GUIDES/integration-patterns/readme.md +30 -30
- package/docs/03-PATTERN-GUIDES/logging-minimal-mode.md +128 -128
- package/docs/03-PATTERN-GUIDES/multiple-connections/examples/common-patterns.ts +380 -380
- package/docs/03-PATTERN-GUIDES/multiple-connections/examples/multiple-connections-readme.md +139 -139
- package/docs/03-PATTERN-GUIDES/multiple-connections/examples/parallel-root-connections.ts +149 -149
- package/docs/03-PATTERN-GUIDES/multiple-connections/examples/real-world-scenarios.ts +405 -405
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-01-foundations.md +378 -378
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-02-quick-start.md +566 -566
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-03-targeting-connections.md +659 -659
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-04-parallel-queries.md +656 -656
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-05-best-practices.md +624 -624
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-06-api-reference.md +824 -824
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-06-versori.md +119 -119
- package/docs/03-PATTERN-GUIDES/multiple-connections/modules/multiple-connections-07-api-reference.md +87 -87
- package/docs/03-PATTERN-GUIDES/multiple-connections/multiple-connections-quick-reference.md +353 -353
- package/docs/03-PATTERN-GUIDES/multiple-connections/multiple-connections-readme.md +270 -270
- package/docs/03-PATTERN-GUIDES/multiple-connections/readme.md +30 -30
- package/docs/03-PATTERN-GUIDES/pagination/pagination-readme.md +14 -14
- package/docs/03-PATTERN-GUIDES/pagination/readme.md +24 -24
- package/docs/03-PATTERN-GUIDES/parquet/examples/common-patterns.ts +180 -180
- package/docs/03-PATTERN-GUIDES/parquet/examples/read-parquet.ts +48 -48
- package/docs/03-PATTERN-GUIDES/parquet/examples/write-parquet.ts +65 -65
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-01-introduction.md +393 -393
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-02-quick-start.md +572 -572
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-03-reading-parquet.md +525 -525
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-04-writing-parquet.md +554 -554
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-05-graphql-extraction.md +405 -405
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-06-performance.md +104 -104
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-06-s3-integration.md +511 -511
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-07-api-reference.md +90 -90
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-07-performance-optimization.md +525 -525
- package/docs/03-PATTERN-GUIDES/parquet/modules/03-pattern-guides-parquet-08-best-practices.md +712 -712
- package/docs/03-PATTERN-GUIDES/parquet/parquet-quick-reference.md +683 -683
- package/docs/03-PATTERN-GUIDES/parquet/parquet-readme.md +248 -248
- package/docs/03-PATTERN-GUIDES/parquet/readme.md +32 -32
- package/docs/03-PATTERN-GUIDES/parsers/parsers-readme.md +12 -12
- package/docs/03-PATTERN-GUIDES/parsers/readme.md +24 -24
- package/docs/03-PATTERN-GUIDES/readme.md +159 -159
- package/docs/03-PATTERN-GUIDES/webhooks/readme.md +24 -24
- package/docs/03-PATTERN-GUIDES/webhooks/webhooks-readme.md +8 -8
- package/docs/04-REFERENCE/architecture/architecture-01-overview.md +427 -427
- package/docs/04-REFERENCE/architecture/architecture-02-client-architecture.md +424 -424
- package/docs/04-REFERENCE/architecture/architecture-03-data-flow.md +690 -690
- package/docs/04-REFERENCE/architecture/architecture-04-service-layer.md +834 -834
- package/docs/04-REFERENCE/architecture/architecture-05-integration-architecture.md +655 -655
- package/docs/04-REFERENCE/architecture/architecture-06-state-management.md +653 -653
- package/docs/04-REFERENCE/architecture/architecture-adding-new-data-sources.md +686 -686
- package/docs/04-REFERENCE/architecture/readme.md +279 -279
- package/docs/04-REFERENCE/platforms/deno/readme.md +117 -117
- package/docs/04-REFERENCE/platforms/nodejs/readme.md +146 -146
- package/docs/04-REFERENCE/platforms/readme.md +135 -135
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-01-introduction.md +398 -398
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-02-quick-start.md +560 -560
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-03-authentication.md +757 -757
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-04-workflows.md +2476 -2476
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-05-connections.md +1167 -1167
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-06-kv-storage.md +990 -990
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-06-state-management.md +121 -121
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-07-api-reference.md +68 -68
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-07-deployment.md +731 -731
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-08-best-practices.md +1111 -1111
- package/docs/04-REFERENCE/platforms/versori/modules/platforms-versori-09-signature-reference.md +766 -766
- package/docs/04-REFERENCE/platforms/versori/platforms-versori-readme.md +299 -299
- package/docs/04-REFERENCE/platforms/versori/platforms-versori-s3-sftp-configuration-guide.md +1425 -1425
- package/docs/04-REFERENCE/platforms/versori/platforms-versori-webhook-api-key-security.md +816 -816
- package/docs/04-REFERENCE/platforms/versori/platforms-versori-webhook-connection-security.md +681 -681
- package/docs/04-REFERENCE/platforms/versori/platforms-versori-workflow-task-types.md +708 -708
- package/docs/04-REFERENCE/platforms/versori/readme.md +108 -108
- package/docs/04-REFERENCE/readme.md +148 -148
- package/docs/04-REFERENCE/resolver-signature/examples/advanced-resolvers.ts +482 -482
- package/docs/04-REFERENCE/resolver-signature/examples/async-resolvers.ts +496 -496
- package/docs/04-REFERENCE/resolver-signature/examples/basic-resolvers.ts +343 -343
- package/docs/04-REFERENCE/resolver-signature/examples/resolver-signature-readme.md +188 -188
- package/docs/04-REFERENCE/resolver-signature/examples/testing-resolvers.ts +463 -463
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-01-foundations.md +286 -286
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-02-parameter-reference.md +643 -643
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-03-basic-examples.md +521 -521
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-04-advanced-patterns.md +739 -739
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-05-sdk-resolvers.md +531 -531
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-06-migration-guide.md +650 -650
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-06-testing.md +125 -125
- package/docs/04-REFERENCE/resolver-signature/modules/resolver-signature-07-api-reference.md +794 -794
- package/docs/04-REFERENCE/resolver-signature/readme.md +64 -64
- package/docs/04-REFERENCE/resolver-signature/resolver-signature-quick-reference.md +270 -270
- package/docs/04-REFERENCE/resolver-signature/resolver-signature-readme.md +351 -351
- package/docs/04-REFERENCE/schema/fluent-commerce-schema.json +764 -764
- package/docs/04-REFERENCE/schema/readme.md +141 -141
- package/docs/04-REFERENCE/testing/examples/04-reference-testing-readme.md +158 -158
- package/docs/04-REFERENCE/testing/examples/fluent-testing.ts +62 -62
- package/docs/04-REFERENCE/testing/examples/health-check.ts +155 -155
- package/docs/04-REFERENCE/testing/examples/integration-test.ts +119 -119
- package/docs/04-REFERENCE/testing/examples/performance-test.ts +183 -183
- package/docs/04-REFERENCE/testing/examples/s3-testing.ts +127 -127
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-01-foundations.md +267 -267
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-02-s3-testing.md +599 -599
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-03-fluent-testing.md +589 -589
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-04-integration-testing.md +699 -699
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-05-debugging.md +478 -478
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-06-cicd-integration.md +463 -463
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-06-preflight-validation.md +131 -131
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-07-best-practices.md +499 -499
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-07-coverage-ci.md +165 -165
- package/docs/04-REFERENCE/testing/modules/04-reference-testing-08-api-reference.md +634 -634
- package/docs/04-REFERENCE/testing/readme.md +86 -86
- package/docs/04-REFERENCE/testing/testing-quick-reference.md +667 -667
- package/docs/04-REFERENCE/testing/testing-readme.md +286 -286
- package/docs/04-REFERENCE/troubleshooting/readme.md +144 -144
- package/docs/04-REFERENCE/troubleshooting/troubleshooting-deno-sftp-compatibility.md +392 -392
- package/docs/template-loading-matrix.md +242 -242
- package/package.json +5 -3
- package/docs/02-CORE-GUIDES/api-reference/cli-profile-integration.md +0 -377
@@ -1,1349 +1,1349 @@
# Module 8: Performance Optimization

**Level:** Advanced
**Estimated Time:** 25 minutes

## Overview

This module teaches you how to optimize ingestion performance through batch sizing strategies, parallel processing, job strategy selection, and memory management.

## Learning Objectives

By the end of this module, you will:
- ✅ Choose optimal batch sizes for your data
- ✅ Implement parallel file processing with concurrency control
- ✅ Select the right job strategy (DAILY vs PER_FILE)
- ✅ Optimize memory usage with streaming
- ✅ Monitor and tune ingestion performance

## Batch Sizing Strategy

### Understanding Batch Sizes

| Batch Size | Use Case | Pros | Cons |
|------------|----------|------|------|
| **100-500** | Testing, small datasets | Fast feedback, easy debugging | More API calls, slower overall |
| **1000-2000** | Standard ingestion | Good balance, reasonable memory | - |
| **5000-10000** | Large datasets, performance | Fewer API calls, faster | Higher memory, longer retries |

### Performance Benchmarks by Batch Size

Real-world performance metrics from production ingestion workflows:

| Batch Size | Records/Second | Time for 10K Records | Memory Usage | Network Calls | Recommended For |
|------------|----------------|----------------------|--------------|---------------|-----------------|
| **100** | 150-200 | ~50-60s | 50-80 MB | 100 | Development, debugging |
| **500** | 400-500 | ~20-25s | 80-120 MB | 20 | Testing, small datasets |
| **1,000** | 800-1,000 | ~10-12s | 120-180 MB | 10 | Standard ingestion |
| **2,000** | 1,400-1,600 | ~6-7s | 180-250 MB | 5 | High-volume ingestion |
| **5,000** | 2,200-2,500 | ~4-5s | 300-450 MB | 2 | Large file processing |
| **10,000** | 2,800-3,200 | ~3-4s | 500-800 MB | 1 | Maximum performance |

**Key Insights:**
- **Optimal range:** 1,000-2,000 records balances speed, memory, and error recovery
- **Diminishing returns:** Beyond 5,000 records, performance gains are minimal
- **Memory considerations:** Each batch occupies ~0.08-0.1 MB per record in memory
- **Network efficiency:** Larger batches reduce API calls but increase retry cost on failure
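The insights above can be sketched as a small helper. This is an illustrative utility, not an SDK API: it estimates API-call count and peak memory for a candidate batch size, using the ~0.09 MB/record midpoint of the range quoted above.

```typescript
// Illustrative helper (not part of the SDK): estimate the cost of a
// candidate batch size using the ~0.08-0.1 MB/record figure from the table.
function estimateBatchCost(totalRecords: number, batchSize: number) {
  const MEMORY_PER_RECORD_MB = 0.09; // midpoint of the observed range
  return {
    apiCalls: Math.ceil(totalRecords / batchSize),
    peakMemoryMb: Math.round(batchSize * MEMORY_PER_RECORD_MB)
  };
}

console.log(estimateBatchCost(10_000, 2_000)); // { apiCalls: 5, peakMemoryMb: 180 }
```

The output matches the benchmarks row for a batch size of 2,000: five network calls and memory at the low end of the 180-250 MB band.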
### Calculating Optimal Batch Size

```typescript
function calculateOptimalBatchSize(
  totalRecords: number,
  avgRecordSizeBytes: number
): number {
  const MAX_BATCH_SIZE = 10000;
  const MAX_PAYLOAD_SIZE = 10 * 1024 * 1024; // 10 MB
  const SAFETY_FACTOR = 0.8; // 80% of max

  // Calculate records that fit in payload
  const recordsPerPayload = Math.floor(
    (MAX_PAYLOAD_SIZE * SAFETY_FACTOR) / avgRecordSizeBytes
  );

  // Return smaller of payload limit or max batch size
  return Math.min(recordsPerPayload, MAX_BATCH_SIZE);
}

// Usage
const avgSize = JSON.stringify(sampleRecord).length;
const batchSize = calculateOptimalBatchSize(10000, avgSize);
console.log(`Optimal batch size: ${batchSize}`);
```

### Dynamic Batch Sizing

```typescript
async function dynamicBatchIngestion(
  client: FluentClient,
  records: any[]
) {
  const batches: any[][] = [];
  let currentBatch: any[] = [];
  let currentSize = 0;
  const TARGET_SIZE = 5 * 1024 * 1024; // 5 MB target

  for (const record of records) {
    const recordSize = JSON.stringify(record).length;

    // Start new batch if adding this record exceeds target
    if (currentSize + recordSize > TARGET_SIZE && currentBatch.length > 0) {
      batches.push([...currentBatch]);
      currentBatch = [];
      currentSize = 0;
    }

    currentBatch.push(record);
    currentSize += recordSize;
  }

  // Add final batch
  if (currentBatch.length > 0) {
    batches.push(currentBatch);
  }

  console.log(`Created ${batches.length} batches from ${records.length} records`);
  return batches;
}
```

### Adaptive Batch Sizing with Performance Feedback

Automatically adjust batch size based on processing performance:

```typescript
import { FluentClient } from '@fluentcommerce/fc-connect-sdk';

interface AdaptiveBatchConfig {
  minBatchSize: number;
  maxBatchSize: number;
  targetProcessingTimeMs: number;
  adjustmentFactor: number;
}

class AdaptiveBatchProcessor {
  private currentBatchSize: number;
  private processingHistory: number[] = [];

  constructor(private config: AdaptiveBatchConfig) {
    this.currentBatchSize = Math.floor(
      (config.minBatchSize + config.maxBatchSize) / 2
    );
  }

  /**
   * Process records with adaptive batch sizing
   */
  async processWithAdaptiveSizing(
    client: FluentClient,
    jobId: string,
    records: any[]
  ): Promise<void> {
    let processedCount = 0;

    while (processedCount < records.length) {
      const batch = records.slice(
        processedCount,
        processedCount + this.currentBatchSize
      );

      const startTime = Date.now();

      try {
        await client.sendBatch(jobId, {
          action: 'UPSERT',
          entityType: 'INVENTORY',
          entities: batch
        });

        const processingTime = Date.now() - startTime;
        this.adjustBatchSize(processingTime);

        processedCount += batch.length;

        console.log(
          `Processed ${batch.length} records in ${processingTime}ms. ` +
          `New batch size: ${this.currentBatchSize}`
        );
      } catch (error) {
        console.error('Batch failed, reducing batch size');
        this.currentBatchSize = Math.max(
          this.config.minBatchSize,
          Math.floor(this.currentBatchSize * 0.5)
        );
        // Don't increment processedCount - retry this batch
      }
    }
  }

  /**
   * Adjust batch size based on processing time
   */
  private adjustBatchSize(processingTimeMs: number): void {
    this.processingHistory.push(processingTimeMs);

    // Keep last 10 measurements
    if (this.processingHistory.length > 10) {
      this.processingHistory.shift();
    }

    const avgProcessingTime =
      this.processingHistory.reduce((a, b) => a + b, 0) /
      this.processingHistory.length;

    // If processing is faster than target, increase batch size
    if (avgProcessingTime < this.config.targetProcessingTimeMs * 0.8) {
      this.currentBatchSize = Math.min(
        this.config.maxBatchSize,
        Math.floor(this.currentBatchSize * (1 + this.config.adjustmentFactor))
      );
    }
    // If processing is slower than target, decrease batch size
    else if (avgProcessingTime > this.config.targetProcessingTimeMs * 1.2) {
      this.currentBatchSize = Math.max(
        this.config.minBatchSize,
        Math.floor(this.currentBatchSize * (1 - this.config.adjustmentFactor))
      );
    }
    // Otherwise maintain current size
  }

  /**
   * Get current batch size
   */
  getCurrentBatchSize(): number {
    return this.currentBatchSize;
  }
}

// Usage example
const adaptiveProcessor = new AdaptiveBatchProcessor({
  minBatchSize: 500,
  maxBatchSize: 5000,
  targetProcessingTimeMs: 3000, // Target 3 seconds per batch
  adjustmentFactor: 0.2 // Adjust by 20% each time
});

await adaptiveProcessor.processWithAdaptiveSizing(client, jobId, allRecords);
```

## Parallel Processing

### Concurrency Control

```typescript
interface FileResult {
  file: string;
  status: 'success' | 'error';
  result?: unknown;
  error?: string;
}

async function parallelFileProcessing(
  files: string[],
  concurrency: number = 5
): Promise<FileResult[]> {
  const queue = [...files];
  const results: FileResult[] = [];

  // Create worker pool
  const workers = Array(concurrency)
    .fill(null)
    .map(async () => {
      while (queue.length > 0) {
        const file = queue.shift();
        if (file) {
          try {
            const result = await processFile(file);
            results.push({ file, status: 'success', result });
          } catch (error) {
            results.push({ file, status: 'error', error: (error as Error).message });
          }
        }
      }
    });

  // Wait for all workers to complete
  await Promise.all(workers);

  console.log(`Processed ${results.length} files with ${concurrency} workers`);
  return results;
}
```

### Production Parallel Processing with Error Handling

Complete parallel processing implementation with retry logic and error isolation:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  StateService,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

interface ProcessingResult {
  fileName: string;
  status: 'success' | 'error' | 'skipped';
  recordsProcessed?: number;
  error?: string;
  processingTimeMs?: number;
}

interface ParallelIngestionConfig {
  concurrency: number;
  batchSize: number;
  maxRetries: number;
  retryDelayMs: number;
}

class ParallelIngestionProcessor {
  constructor(
    private client: FluentClient,
    private s3: S3DataSource,
    private parser: CSVParserService,
    private mapper: UniversalMapper,
    private state: StateService,
    private kv: KVStore,
    private config: ParallelIngestionConfig
  ) {}

  /**
   * Process multiple files in parallel with concurrency control
   */
  async processFiles(
    bucket: string,
    fileKeys: string[]
  ): Promise<ProcessingResult[]> {
    const results: ProcessingResult[] = [];
    const queue = [...fileKeys];
    let activeWorkers = 0;
    const maxWorkers = this.config.concurrency;

    console.log(`Processing ${fileKeys.length} files with ${maxWorkers} workers`);

    // Create worker pool
    const workers = Array(maxWorkers)
      .fill(null)
      .map(async (_, workerId) => {
        while (queue.length > 0) {
          const fileKey = queue.shift();
          if (!fileKey) break;

          activeWorkers++;
          console.log(`Worker ${workerId}: Processing ${fileKey} (${activeWorkers} active)`);

          const result = await this.processFileWithRetry(bucket, fileKey);
          results.push(result);

          activeWorkers--;
          console.log(`Worker ${workerId}: Completed ${fileKey} (${activeWorkers} active)`);
        }
      });

    // Wait for all workers to finish
    await Promise.all(workers);

    return results;
  }

  /**
   * Process a single file with retry logic
   */
  private async processFileWithRetry(
    bucket: string,
    fileKey: string
  ): Promise<ProcessingResult> {
    const startTime = Date.now();
    let lastError: Error | null = null;

    // Check if already processed
    if (await this.state.isFileProcessed(this.kv, fileKey)) {
      return {
        fileName: fileKey,
        status: 'skipped',
        processingTimeMs: Date.now() - startTime
      };
    }

    // Retry loop
    for (let attempt = 1; attempt <= this.config.maxRetries; attempt++) {
      try {
        const recordsProcessed = await this.processFileSingle(bucket, fileKey);

        return {
          fileName: fileKey,
          status: 'success',
          recordsProcessed,
          processingTimeMs: Date.now() - startTime
        };
      } catch (error) {
        lastError = error as Error;
        console.error(
          `Attempt ${attempt}/${this.config.maxRetries} failed for ${fileKey}: ${lastError.message}`
        );

        if (attempt < this.config.maxRetries) {
          const delay = this.config.retryDelayMs * attempt;
          console.log(`Retrying ${fileKey} in ${delay}ms...`);
          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    }

    // All retries exhausted
    return {
      fileName: fileKey,
      status: 'error',
      error: lastError?.message || 'Unknown error',
      processingTimeMs: Date.now() - startTime
    };
  }

  /**
   * Process a single file
   */
  private async processFileSingle(
    bucket: string,
    fileKey: string
  ): Promise<number> {
    // Read file from S3
    const fileContent = await this.s3.downloadFile(fileKey);

    // Parse CSV
    const records = await this.parser.parse(fileContent);

    if (records.length === 0) {
      throw new Error('File contains no records');
    }

    // Map fields
    const mappingResult = await this.mapper.map(records);

    if (!mappingResult.success || mappingResult.data.length === 0) {
      throw new Error(`Field mapping failed: ${mappingResult.errors.join(', ')}`);
    }

    // Create job
    const job = await this.client.createJob({
      name: `Import - ${fileKey}`,
      retailerId: process.env.FLUENT_RETAILER_ID!,
      metadata: {
        fileName: fileKey,
        recordCount: mappingResult.data.length
      }
    });

    // Send in batches
    const batches = this.createBatches(mappingResult.data, this.config.batchSize);

    for (const [index, batch] of batches.entries()) {
      await this.client.sendBatch(job.id, {
        action: 'UPSERT',
        entityType: 'INVENTORY',
        entities: batch
      });

      console.log(`Sent batch ${index + 1}/${batches.length} for ${fileKey}`);
    }

    // Mark as processed
    await this.state.markFileProcessed(fileKey, {
      jobId: job.id,
      recordCount: mappingResult.data.length,
      timestamp: new Date().toISOString()
    });

    return mappingResult.data.length;
  }

  /**
   * Split records into batches
   */
  private createBatches<T>(items: T[], batchSize: number): T[][] {
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += batchSize) {
      batches.push(items.slice(i, i + batchSize));
    }
    return batches;
  }
}

// Usage example
const kvAdapter = new VersoriKVAdapter(openKv(':project:'));
const processor = new ParallelIngestionProcessor(
  client,
  s3DataSource,
  csvParser,
  mapper,
  stateService,
  kvAdapter,
  {
    concurrency: 5,
    batchSize: 2000,
    maxRetries: 3,
    retryDelayMs: 1000
  }
);

const files = await s3DataSource.listFiles({ prefix: 'data/' });
const results = await processor.processFiles('bucket', files.map(f => f.path));

// Summary
const successful = results.filter(r => r.status === 'success').length;
const failed = results.filter(r => r.status === 'error').length;
const skipped = results.filter(r => r.status === 'skipped').length;

console.log(`Processing complete: ${successful} successful, ${failed} failed, ${skipped} skipped`);
```
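The retry loop in `processFileWithRetry` uses a linear delay (`retryDelayMs * attempt`). A common alternative is exponential backoff with full jitter; a minimal sketch, where the base and cap values are illustrative rather than SDK configuration:

```typescript
// Sketch: exponential backoff with full jitter as an alternative to the
// linear retry delay shown above. The window doubles each attempt, capped
// at capMs; a random delay inside the window avoids thundering-herd retries.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const windowMs = Math.min(capMs, baseMs * 2 ** (attempt - 1));
  return Math.floor(Math.random() * windowMs);
}
```

To use it, replace the `delay` computed in the retry loop with `backoffDelayMs(attempt)`.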
### Rate Limiting

```typescript
class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private maxTokens: number,
    private refillRate: number // tokens per second
  ) {
    this.tokens = maxTokens;
    this.lastRefill = Date.now();
  }

  async acquire(tokens: number = 1): Promise<void> {
    await this.refillTokens();

    while (this.tokens < tokens) {
      await new Promise(resolve => setTimeout(resolve, 100));
      await this.refillTokens();
    }

    this.tokens -= tokens;
  }

  private async refillTokens(): Promise<void> {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;

    this.tokens = Math.min(this.maxTokens, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }
}

// Usage
const rateLimiter = new RateLimiter(100, 10); // 100 max, 10 per second

async function rateLimitedProcessing(files: string[]) {
  for (const file of files) {
    await rateLimiter.acquire();
    await processFile(file);
  }
}
```
## Job Strategy Comparison

### DAILY Strategy

**When to use:** Frequent, small updates throughout the day

```typescript
async function dailyJobStrategy(client: FluentClient) {
  const dateKey = new Date().toISOString().split('T')[0];
  const jobName = `Daily Inventory - ${dateKey}`;

  // Reuse same job for all files processed today
  let jobId = await findOrCreateDailyJob(client, dateKey);

  const files = await s3.listFiles({ prefix: 'data/' });

  for (const file of files) {
    // All files use the same job
    await client.sendBatch(jobId, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: fileData
    });
  }
}

async function findOrCreateDailyJob(
  client: FluentClient,
  dateKey: string
): Promise<string> {
  // Check state for today's job
  const existingJobId = await state.get(`daily-job:${dateKey}`);

  if (existingJobId) {
    return existingJobId;
  }

  // Create new daily job
  const job = await client.createJob({
    name: `Daily Inventory - ${dateKey}`,
    retailerId: process.env.FLUENT_RETAILER_ID!
  });

  // Cache for today
  await state.set(`daily-job:${dateKey}`, job.id, 24 * 60 * 60 * 1000);

  return job.id;
}
```

**Pros:**
- ✅ Fewer jobs to track
- ✅ Consolidated daily batches
- ✅ Easier monitoring

**Cons:**
- ❌ All files succeed/fail together
- ❌ Harder to track individual file status

### PER_FILE Strategy

**When to use:** Large files, independent processing, failure isolation

```typescript
async function perFileJobStrategy(client: FluentClient) {
  const files = await s3.listFiles({ prefix: 'data/' });

  for (const file of files) {
    // Create dedicated job for each file
    const job = await client.createJob({
      name: `Import - ${file.path}`,
      retailerId: process.env.FLUENT_RETAILER_ID!,
      metadata: {
        fileName: file.path,
        fileSize: file.size
      }
    });

    await client.sendBatch(job.id, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: fileData
    });

    // Track job per file
    await state.markFileProcessed(file.path, {
      jobId: job.id,
      status: 'success'
    });
  }
}
```

**Pros:**
- ✅ Independent file processing
- ✅ Fine-grained error tracking
- ✅ Easy retry of failed files

**Cons:**
- ❌ More jobs to manage
- ❌ More API calls

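The trade-offs above can be folded into a small helper that picks a strategy from the day's workload. This is a hypothetical sketch: `JobStrategy` and the thresholds are illustrative assumptions, not part of the SDK.

```typescript
// Hypothetical helper - the type and thresholds are illustrative, not SDK API
type JobStrategy = 'DAILY' | 'PER_FILE';

function chooseJobStrategy(files: Array<{ size: number }>): JobStrategy {
  const LARGE_FILE_BYTES = 100 * 1024 * 1024; // assumed 100 MB cutoff
  const MANY_SMALL_FILES = 20;                // assumed file-count cutoff

  const hasLargeFile = files.some(f => f.size >= LARGE_FILE_BYTES);

  // Large files benefit from failure isolation; many small updates
  // are easier to track under a single consolidated daily job
  if (hasLargeFile) return 'PER_FILE';
  if (files.length >= MANY_SMALL_FILES) return 'DAILY';
  return 'PER_FILE';
}
```

Tune the cutoffs to your own file sizes and API-call budget; the point is to make the choice explicit rather than hard-coding one strategy.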
## Memory Optimization

### Streaming Large Files

```typescript
async function streamLargeFile(filePath: string) {
  const s3 = new S3DataSource(
    {
      type: 'S3_CSV',
      connectionId: 's3-stream',
      name: 'S3 Stream',
      s3Config: s3Config,
    },
    logger
  );
  const parser = new CSVParserService();

  // Stream file from S3 (no full file in memory)
  const data = await s3.downloadFile(filePath);
  const recordIterator = parser.parseStreaming(data);

  let batch = [];
  const BATCH_SIZE = 1000;

  for await (const record of recordIterator) {
    batch.push(record);

    if (batch.length >= BATCH_SIZE) {
      await processBatch(batch);
      batch = []; // Free memory
    }
  }

  // Process remaining records
  if (batch.length > 0) {
    await processBatch(batch);
  }
}
```

### Chunked Processing for Memory Efficiency

Process large files in chunks to prevent memory exhaustion:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

interface ChunkProcessingConfig {
  chunkSize: number;  // Records per chunk
  batchSize: number;  // Records per batch API call
  enableStreaming: boolean;
}

async function processLargeFileInChunks(
  client: FluentClient,
  s3: S3DataSource,
  bucket: string,
  fileKey: string,
  mapper: UniversalMapper,
  config: ChunkProcessingConfig
): Promise<void> {
  console.log(`Processing ${fileKey} with streaming: ${config.enableStreaming}`);

  let totalProcessed = 0;
  let currentChunk: any[] = [];

  if (config.enableStreaming) {
    // Streaming approach - memory efficient
    const data = await s3.downloadFile(fileKey);
    const parser = new CSVParserService();
    const recordIterator = parser.parseStreaming(data);

    for await (const record of recordIterator) {
      currentChunk.push(record);

      if (currentChunk.length >= config.chunkSize) {
        const processed = await processChunk(
          client,
          currentChunk,
          mapper,
          config.batchSize,
          fileKey
        );
        totalProcessed += processed;

        // Clear chunk from memory
        currentChunk = [];

        // Log memory usage
        logMemoryUsage(`After chunk (${totalProcessed} records)`);
      }
    }
  } else {
    // Non-streaming approach - load entire file
    const fileContent = await s3.downloadFile(fileKey);
    const parser = new CSVParserService();
    const allRecords = await parser.parse(fileContent);

    // Process in chunks
    for (let i = 0; i < allRecords.length; i += config.chunkSize) {
      const chunk = allRecords.slice(i, i + config.chunkSize);
      const processed = await processChunk(
        client,
        chunk,
        mapper,
        config.batchSize,
        fileKey
      );
      totalProcessed += processed;

      logMemoryUsage(`After chunk (${totalProcessed} records)`);
    }
  }

  // Process records left over from the streaming path
  if (currentChunk.length > 0) {
    const processed = await processChunk(
      client,
      currentChunk,
      mapper,
      config.batchSize,
      fileKey
    );
    totalProcessed += processed;
  }

  console.log(`Completed ${fileKey}: ${totalProcessed} records processed`);
}

async function processChunk(
  client: FluentClient,
  chunk: any[],
  mapper: UniversalMapper,
  batchSize: number,
  fileKey: string
): Promise<number> {
  // Map fields
  const mappingResult = await mapper.map(chunk);

  if (!mappingResult.success) {
    console.error(`Mapping errors in ${fileKey}:`, mappingResult.errors);
    return 0;
  }

  // Create job for this chunk
  const job = await client.createJob({
    name: `Import Chunk - ${fileKey} - ${new Date().toISOString()}`,
    retailerId: process.env.FLUENT_RETAILER_ID!
  });

  // Send in batches
  const batches = createBatches(mappingResult.data, batchSize);

  for (const batch of batches) {
    await client.sendBatch(job.id, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: batch
    });
  }

  return mappingResult.data.length;
}

function createBatches<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Usage
await processLargeFileInChunks(
  client,
  s3DataSource,
  'inventory-bucket',
  'large-file-500k-records.csv',
  mapper,
  {
    chunkSize: 5000,  // Process 5K records at a time
    batchSize: 2000,  // Send 2K records per batch
    enableStreaming: true
  }
);
```

### Memory Monitoring

```typescript
function logMemoryUsage(label: string) {
  const used = process.memoryUsage();
  console.log(`[${label}] Memory Usage:`);
  console.log(`  RSS: ${Math.round(used.rss / 1024 / 1024)} MB`);
  console.log(`  Heap Total: ${Math.round(used.heapTotal / 1024 / 1024)} MB`);
  console.log(`  Heap Used: ${Math.round(used.heapUsed / 1024 / 1024)} MB`);
}

async function memoryEfficientProcessing() {
  logMemoryUsage('Start');

  const files = await listFiles();
  logMemoryUsage('After listing files');

  for (const file of files) {
    await streamLargeFile(file);
    logMemoryUsage(`After processing ${file}`);

    // Force garbage collection if available
    if (global.gc) {
      global.gc();
    }
  }

  logMemoryUsage('End');
}
```

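Note that the `global.gc` guard above only fires when Node exposes the garbage collector, which requires an explicit flag at launch; for example (the script path is a placeholder):

```shell
# global.gc is undefined unless Node is started with --expose-gc
node --expose-gc dist/ingestion.js

# Quick one-off check that gc is actually exposed
node --expose-gc -e "global.gc(); console.log(typeof global.gc)"
```

Without the flag the `if (global.gc)` branch is simply skipped, so the code is safe either way.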
## Performance Monitoring

### Metrics Collection

```typescript
interface PerformanceMetrics {
  filesProcessed: number;
  recordsProcessed: number;
  totalTimeMs: number;
  avgTimePerFile: number;
  avgRecordsPerSecond: number;
  batchesSent: number;
  errors: number;
}

async function trackPerformance(
  operation: (metrics: PerformanceMetrics) => Promise<void>
): Promise<PerformanceMetrics> {
  const startTime = Date.now();
  const metrics: PerformanceMetrics = {
    filesProcessed: 0,
    recordsProcessed: 0,
    totalTimeMs: 0,
    avgTimePerFile: 0,
    avgRecordsPerSecond: 0,
    batchesSent: 0,
    errors: 0
  };

  // The operation increments the counters (filesProcessed,
  // recordsProcessed, batchesSent, errors) as it runs
  await operation(metrics);

  metrics.totalTimeMs = Date.now() - startTime;
  metrics.avgTimePerFile = metrics.totalTimeMs / metrics.filesProcessed;
  metrics.avgRecordsPerSecond =
    (metrics.recordsProcessed / metrics.totalTimeMs) * 1000;

  return metrics;
}
```

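A self-contained sketch of populating these counters around a run (the interface is repeated and `listFiles`/`countRecords` are hypothetical stand-ins so the snippet runs on its own):

```typescript
// Sketch only: listFiles and countRecords are placeholder stubs,
// not SDK functions - replace them with your real pipeline steps.
interface PerformanceMetrics {
  filesProcessed: number;
  recordsProcessed: number;
  totalTimeMs: number;
  avgTimePerFile: number;
  avgRecordsPerSecond: number;
  batchesSent: number;
  errors: number;
}

async function listFiles(): Promise<string[]> {
  return ['a.csv', 'b.csv']; // stand-in for a real S3 listing
}

async function countRecords(_file: string): Promise<number> {
  return 1000; // stand-in for real parsing + sending
}

async function runWithMetrics(): Promise<PerformanceMetrics> {
  const start = Date.now();
  const metrics: PerformanceMetrics = {
    filesProcessed: 0, recordsProcessed: 0, totalTimeMs: 0,
    avgTimePerFile: 0, avgRecordsPerSecond: 0, batchesSent: 0, errors: 0
  };

  for (const file of await listFiles()) {
    try {
      metrics.recordsProcessed += await countRecords(file);
      metrics.filesProcessed++;
    } catch {
      metrics.errors++; // a failed file counts as an error, not a crash
    }
  }

  metrics.totalTimeMs = Date.now() - start;
  metrics.avgTimePerFile = metrics.totalTimeMs / Math.max(metrics.filesProcessed, 1);
  metrics.avgRecordsPerSecond =
    (metrics.recordsProcessed / Math.max(metrics.totalTimeMs, 1)) * 1000;
  return metrics;
}
```

The `Math.max(..., 1)` guards avoid division by zero when nothing was processed.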
## Complete Optimized Production Example

Putting it all together: adaptive sizing, parallel processing, streaming, and monitoring:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  StateService,
  VersoriKVAdapter,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

/**
 * Production-grade optimized ingestion pipeline
 * Combines all performance optimization techniques
 */
class OptimizedIngestionPipeline {
  private client: FluentClient;
  private s3: S3DataSource;
  private parser: CSVParserService;
  private mapper: UniversalMapper;
  private state: StateService;
  private kv: KVStore;
  private adaptiveProcessor: AdaptiveBatchProcessor;
  private metrics: PerformanceMetrics;

  constructor(
    private config: {
      fluent: any;
      s3: any;
      mapping: any;
      kv: any;
      concurrency: number;
      enableStreaming: boolean;
      enableAdaptiveSizing: boolean;
    }
  ) {}

  /**
   * Initialize all components
   */
  async initialize(): Promise<void> {
    // Create Fluent client
    this.client = await createClient({ config: this.config.fluent });

    // Initialize logger
    const logger = toStructuredLogger(createConsoleLogger(), {
      service: 'optimized-ingestion',
      correlationId: generateCorrelationId()
    });

    // Initialize data sources
    this.s3 = new S3DataSource(
      {
        type: 'S3_CSV',
        connectionId: 's3-optimized',
        name: 'S3 Optimized',
        s3Config: this.config.s3,
      },
      logger
    );
    this.parser = new CSVParserService(logger);

    // Initialize field mapper
    this.mapper = new UniversalMapper(this.config.mapping);

    // Initialize state management
    this.kv = new VersoriKVAdapter(this.config.kv);
    this.state = new StateService(logger);

    // Initialize adaptive processor
    if (this.config.enableAdaptiveSizing) {
      this.adaptiveProcessor = new AdaptiveBatchProcessor({
        minBatchSize: 500,
        maxBatchSize: 5000,
        targetProcessingTimeMs: 3000,
        adjustmentFactor: 0.2
      });
    }

    // Initialize metrics
    this.metrics = {
      filesProcessed: 0,
      recordsProcessed: 0,
      totalTimeMs: 0,
      avgTimePerFile: 0,
      avgRecordsPerSecond: 0,
      batchesSent: 0,
      errors: 0
    };

    console.log('Optimized ingestion pipeline initialized');
  }

  /**
   * Run the complete ingestion pipeline
   */
  async run(bucket: string, prefix: string): Promise<PerformanceMetrics> {
    const startTime = Date.now();

    try {
      // List all files
      console.log(`Listing files in s3://${bucket}/${prefix}`);
      const allFiles = await this.s3.listFiles({ prefix });
      console.log(`Found ${allFiles.length} files`);

      // Filter unprocessed files
      const unprocessedFiles = [];
      for (const file of allFiles) {
        if (!(await this.state.isFileProcessed(this.kv, file.path))) {
          unprocessedFiles.push(file);
        }
      }

      console.log(
        `${unprocessedFiles.length} files to process (${allFiles.length - unprocessedFiles.length} already processed)`
      );

      if (unprocessedFiles.length === 0) {
        console.log('No files to process');
        return this.metrics;
      }

      // Process files in parallel
      await this.processFilesInParallel(bucket, unprocessedFiles);

      // Calculate metrics
      this.metrics.totalTimeMs = Date.now() - startTime;
      this.metrics.avgTimePerFile = this.metrics.totalTimeMs / this.metrics.filesProcessed;
      this.metrics.avgRecordsPerSecond =
        (this.metrics.recordsProcessed / this.metrics.totalTimeMs) * 1000;

      // Log summary
      this.logSummary();

      return this.metrics;

    } catch (error) {
      console.error('Pipeline failed:', error);
      throw error;
    }
  }

  /**
   * Process multiple files in parallel
   */
  private async processFilesInParallel(
    bucket: string,
    files: Array<{ path: string; size: number }>
  ): Promise<void> {
    const queue = [...files];
    const concurrency = this.config.concurrency;

    console.log(`Processing ${files.length} files with ${concurrency} workers`);

    // Create worker pool
    const workers = Array(concurrency)
      .fill(null)
      .map(async (_, workerId) => {
        while (queue.length > 0) {
          const file = queue.shift();
          if (!file) break;

          console.log(
            `Worker ${workerId}: Processing ${file.path} (${queue.length} remaining)`
          );

          try {
            await this.processFile(bucket, file.path);
            this.metrics.filesProcessed++;
          } catch (error) {
            console.error(`Worker ${workerId}: Failed to process ${file.path}:`, error);
            this.metrics.errors++;
          }
        }
      });

    await Promise.all(workers);
  }

  /**
   * Process a single file with all optimizations
   */
  private async processFile(bucket: string, fileKey: string): Promise<void> {
    const fileStartTime = Date.now();

    // Create job for this file
    const job = await this.client.createJob({
      name: `Optimized Import - ${fileKey}`,
      retailerId: this.config.fluent.retailerId,
      metadata: {
        fileName: fileKey,
        pipeline: 'optimized'
      }
    });

    let recordsInFile = 0;

    if (this.config.enableStreaming) {
      // Streaming approach - memory efficient
      recordsInFile = await this.processFileStreaming(bucket, fileKey, job.id);
    } else {
      // Standard approach - load entire file
      recordsInFile = await this.processFileStandard(bucket, fileKey, job.id);
    }

    // Mark file as processed
    await this.state.markFileProcessed(fileKey, {
      jobId: job.id,
      recordCount: recordsInFile,
      processingTimeMs: Date.now() - fileStartTime,
      timestamp: new Date().toISOString()
    });

    this.metrics.recordsProcessed += recordsInFile;

    console.log(
      `Completed ${fileKey}: ${recordsInFile} records in ${Date.now() - fileStartTime}ms`
    );
  }

  /**
   * Process file using streaming (memory efficient)
   */
  private async processFileStreaming(
    bucket: string,
    fileKey: string,
    jobId: string
  ): Promise<number> {
    const stream = await this.s3.streamObject(bucket, fileKey);
    const recordIterator = this.parser.parseStream(stream);

    let batch: any[] = [];
    let totalRecords = 0;
    const batchSize = this.config.enableAdaptiveSizing
      ? this.adaptiveProcessor.getCurrentBatchSize()
      : 2000;

    for await (const record of recordIterator) {
      batch.push(record);

      if (batch.length >= batchSize) {
        const mappingResult = await this.mapper.map(batch);

        if (mappingResult.success && mappingResult.data.length > 0) {
          await this.client.sendBatch(jobId, {
            action: 'UPSERT',
            entityType: 'INVENTORY',
            entities: mappingResult.data
          });

          this.metrics.batchesSent++;
          totalRecords += mappingResult.data.length;
        }

        batch = []; // Free memory
        logMemoryUsage(`After batch (${totalRecords} records)`);
      }
    }

    // Process remaining records
    if (batch.length > 0) {
      const mappingResult = await this.mapper.map(batch);

      if (mappingResult.success && mappingResult.data.length > 0) {
        await this.client.sendBatch(jobId, {
          action: 'UPSERT',
          entityType: 'INVENTORY',
          entities: mappingResult.data
        });

        this.metrics.batchesSent++;
        totalRecords += mappingResult.data.length;
      }
    }

    return totalRecords;
  }

  /**
   * Process file using standard approach (adaptive sizing)
   */
  private async processFileStandard(
    bucket: string,
    fileKey: string,
    jobId: string
  ): Promise<number> {
    // Read entire file
    const fileContent = await this.s3.downloadFile(fileKey);
    const records = await this.parser.parse(fileContent);

    // Map all records
    const mappingResult = await this.mapper.map(records);

    if (!mappingResult.success || mappingResult.data.length === 0) {
      throw new Error(`Field mapping failed: ${mappingResult.errors.join(', ')}`);
    }

    // Use adaptive sizing or standard batching
    if (this.config.enableAdaptiveSizing) {
      await this.adaptiveProcessor.processWithAdaptiveSizing(
        this.client,
        jobId,
        mappingResult.data
      );
      this.metrics.batchesSent += Math.ceil(
        mappingResult.data.length / this.adaptiveProcessor.getCurrentBatchSize()
      );
    } else {
      const batches = this.createBatches(mappingResult.data, 2000);

      for (const batch of batches) {
        await this.client.sendBatch(jobId, {
          action: 'UPSERT',
          entityType: 'INVENTORY',
          entities: batch
        });

        this.metrics.batchesSent++;
      }
    }

    return mappingResult.data.length;
  }

  /**
   * Create batches from records
   */
  private createBatches<T>(items: T[], batchSize: number): T[][] {
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += batchSize) {
      batches.push(items.slice(i, i + batchSize));
    }
    return batches;
  }

  /**
   * Log performance summary
   */
  private logSummary(): void {
    console.log('\n═══════════════════════════════════════════════');
    console.log('         INGESTION PERFORMANCE SUMMARY         ');
    console.log('═══════════════════════════════════════════════');
    console.log(`Files Processed: ${this.metrics.filesProcessed}`);
    console.log(`Records Processed: ${this.metrics.recordsProcessed.toLocaleString()}`);
    console.log(`Batches Sent: ${this.metrics.batchesSent}`);
    console.log(`Errors: ${this.metrics.errors}`);
    console.log(`Total Time: ${(this.metrics.totalTimeMs / 1000).toFixed(2)}s`);
    console.log(`Avg Time Per File: ${(this.metrics.avgTimePerFile / 1000).toFixed(2)}s`);
    console.log(`Throughput: ${Math.round(this.metrics.avgRecordsPerSecond)} records/sec`);
    console.log(`Success Rate: ${((this.metrics.filesProcessed / (this.metrics.filesProcessed + this.metrics.errors)) * 100).toFixed(1)}%`);
    console.log('═══════════════════════════════════════════════\n');
  }
}

// ===== Usage Example =====

async function main() {
  // Create optimized pipeline
  const pipeline = new OptimizedIngestionPipeline({
    fluent: {
      baseUrl: process.env.FLUENT_BASE_URL!,
      clientId: process.env.FLUENT_CLIENT_ID!,
      clientSecret: process.env.FLUENT_CLIENT_SECRET!,
      retailerId: process.env.FLUENT_RETAILER_ID!
    },
    s3: {
      region: process.env.AWS_REGION!,
      credentials: {
        accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
        secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!
      }
    },
    mapping: {
      fields: {
        ref: { source: 'sku', required: true },
        productRef: { source: 'product_id', required: true },
        locationRef: { source: 'warehouse_code', required: true },
        qty: { source: 'quantity', resolver: 'sdk.parseInt', required: true },
        type: { source: 'inventory_type', default: 'ON_HAND' },
        status: { source: 'status', default: 'AVAILABLE' }
      }
    },
    kv: openKv(),                 // Versori KV or custom implementation
    concurrency: 5,               // 5 parallel workers
    enableStreaming: true,        // Use streaming for memory efficiency
    enableAdaptiveSizing: true    // Auto-adjust batch sizes
  });

  // Initialize pipeline
  await pipeline.initialize();

  // Run ingestion
  const metrics = await pipeline.run(
    'inventory-bucket',
    'data/inventory/'
  );

  console.log('Pipeline completed successfully');
  process.exit(0);
}

// Run if executed directly
if (require.main === module) {
  main().catch(error => {
    console.error('Fatal error:', error);
    process.exit(1);
  });
}
```

**What This Example Demonstrates:**

- ✅ **Parallel processing** - 5 concurrent workers process files simultaneously
- ✅ **Adaptive batch sizing** - Automatically adjusts batch size based on performance
- ✅ **Streaming support** - Memory-efficient processing for large files
- ✅ **State management** - Prevents duplicate file processing
- ✅ **Comprehensive metrics** - Detailed performance tracking and reporting
- ✅ **Error isolation** - One file failure doesn't stop the entire pipeline
- ✅ **Production-ready** - Includes logging, error handling, and monitoring

|
-
## Key Takeaways
|
|
1336
|
-
|
|
1337
|
-
- 🎯 **Batch size** - 1000-2000 records is optimal for most use cases
|
|
1338
|
-
- 🎯 **Parallel processing** - Use 3-5 workers for best throughput
|
|
1339
|
-
- 🎯 **DAILY strategy** - Best for frequent small updates
|
|
1340
|
-
- 🎯 **PER_FILE strategy** - Best for large independent files
|
|
1341
|
-
- 🎯 **Streaming** - Essential for files > 100 MB
|
|
1342
|
-
|
|
1343
|
-
## Next Steps
|
|
1344
|
-
|
|
1345
|
-
Continue to [Module 9: Best Practices →](./02-core-guides-ingestion-09-best-practices.md) to learn about comprehensive error handling, monitoring strategies, security best practices, and production-ready deployment patterns.
|
|
1346
|
-
|
|
1347
|
-
---
|
|
1348
|
-
|
|
1349
|
-
[← Back to Ingestion Guide](../ingestion-readme.md) | [Previous: Module 7 - State Management](./02-core-guides-ingestion-07-state-management.md) | [Next: Module 9 - Best Practices →](./02-core-guides-ingestion-09-best-practices.md)
|
|
1
|
+
# Module 8: Performance Optimization
|
|
2
|
+
|
|
3
|
+
**Level:** Advanced
|
|
4
|
+
**Estimated Time:** 25 minutes
|
|
5
|
+
|
|
6
|
+
## Overview
|
|
7
|
+
|
|
8
|
+
This module teaches you how to optimize ingestion performance through batch sizing strategies, parallel processing, job strategy selection, and memory management.
|
|
9
|
+
|
|
10
|
+
## Learning Objectives
|
|
11
|
+
|
|
12
|
+
By the end of this module, you will:
|
|
13
|
+
- ✅ Choose optimal batch sizes for your data
|
|
14
|
+
- ✅ Implement parallel file processing with concurrency control
|
|
15
|
+
- ✅ Select the right job strategy (DAILY vs PER_FILE)
|
|
16
|
+
- ✅ Optimize memory usage with streaming
|
|
17
|
+
- ✅ Monitor and tune ingestion performance
|
|
18
|
+
|
|
19
|
+
## Batch Sizing Strategy
|
|
20
|
+
|
|
21
|
+
### Understanding Batch Sizes
|
|
22
|
+
|
|
23
|
+
| Batch Size | Use Case | Pros | Cons |
|
|
24
|
+
|------------|----------|------|------|
|
|
25
|
+
| **100-500** | Testing, small datasets | Fast feedback, easy debugging | More API calls, slower overall |
|
|
26
|
+
| **1000-2000** | Standard ingestion | Good balance, reasonable memory | - |
|
|
27
|
+
| **5000-10000** | Large datasets, performance | Fewer API calls, faster | Higher memory, longer retries |

### Performance Benchmarks by Batch Size

Real-world performance metrics from production ingestion workflows:

| Batch Size | Records/Second | Time for 10K Records | Memory Usage | Network Calls | Recommended For |
|------------|----------------|----------------------|--------------|---------------|-----------------|
| **100** | 150-200 | ~50-60s | 50-80 MB | 100 | Development, debugging |
| **500** | 400-500 | ~20-25s | 80-120 MB | 20 | Testing, small datasets |
| **1,000** | 800-1,000 | ~10-12s | 120-180 MB | 10 | Standard ingestion |
| **2,000** | 1,400-1,600 | ~6-7s | 180-250 MB | 5 | High-volume ingestion |
| **5,000** | 2,200-2,500 | ~4-5s | 300-450 MB | 2 | Large file processing |
| **10,000** | 2,800-3,200 | ~3-4s | 500-800 MB | 1 | Maximum performance |

**Key Insights:**
- **Optimal range:** 1,000-2,000 records balances speed, memory, and error recovery
- **Diminishing returns:** Beyond 5,000 records, performance gains are minimal
- **Memory considerations:** Each batch occupies ~0.08-0.1 MB per record in memory
- **Network efficiency:** Larger batches reduce API calls but increase retry cost on failure
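
The memory insight above translates into a quick sizing check. A minimal, SDK-independent sketch that estimates a batch's in-memory footprint from a sample record (the sample record shape is illustrative):

```typescript
// Estimate the in-memory footprint of a batch from one sample record.
// Assumes records are roughly uniform in size (a simplification);
// JSON length approximates the serialized payload size in bytes.
function estimateBatchMemoryMB(sampleRecord: object, batchSize: number): number {
  const recordBytes = JSON.stringify(sampleRecord).length;
  return (recordBytes * batchSize) / (1024 * 1024);
}

const sample = { sku: 'SKU-001', locationRef: 'DC-1', qty: 42 };
const mb = estimateBatchMemoryMB(sample, 2000);
console.log(`~${mb.toFixed(2)} MB serialized for a 2,000-record batch`);
```

Note this measures serialized size only; actual heap usage per record is typically higher, which is why the table's figures exceed this estimate.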

### Calculating Optimal Batch Size

```typescript
function calculateOptimalBatchSize(
  totalRecords: number,
  avgRecordSizeBytes: number
): number {
  const MAX_BATCH_SIZE = 10000;
  const MAX_PAYLOAD_SIZE = 10 * 1024 * 1024; // 10 MB
  const SAFETY_FACTOR = 0.8; // 80% of max

  // Calculate records that fit in payload
  const recordsPerPayload = Math.floor(
    (MAX_PAYLOAD_SIZE * SAFETY_FACTOR) / avgRecordSizeBytes
  );

  // Return the smallest of the payload limit, the max batch size,
  // or the total record count (a batch never needs to exceed it)
  return Math.min(recordsPerPayload, MAX_BATCH_SIZE, totalRecords);
}

// Usage
const avgSize = JSON.stringify(sampleRecord).length;
const batchSize = calculateOptimalBatchSize(10000, avgSize);
console.log(`Optimal batch size: ${batchSize}`);
```

### Dynamic Batch Sizing

```typescript
async function dynamicBatchIngestion(
  client: FluentClient,
  records: any[]
) {
  const batches = [];
  let currentBatch = [];
  let currentSize = 0;
  const TARGET_SIZE = 5 * 1024 * 1024; // 5 MB target

  for (const record of records) {
    const recordSize = JSON.stringify(record).length;

    // Start new batch if adding this record exceeds target
    if (currentSize + recordSize > TARGET_SIZE && currentBatch.length > 0) {
      batches.push([...currentBatch]);
      currentBatch = [];
      currentSize = 0;
    }

    currentBatch.push(record);
    currentSize += recordSize;
  }

  // Add final batch
  if (currentBatch.length > 0) {
    batches.push(currentBatch);
  }

  console.log(`Created ${batches.length} batches from ${records.length} records`);
  return batches;
}
```

### Adaptive Batch Sizing with Performance Feedback

Automatically adjust batch size based on processing performance:

```typescript
import { FluentClient } from '@fluentcommerce/fc-connect-sdk';

interface AdaptiveBatchConfig {
  minBatchSize: number;
  maxBatchSize: number;
  targetProcessingTimeMs: number;
  adjustmentFactor: number;
}

class AdaptiveBatchProcessor {
  private currentBatchSize: number;
  private processingHistory: number[] = [];

  constructor(private config: AdaptiveBatchConfig) {
    this.currentBatchSize = Math.floor(
      (config.minBatchSize + config.maxBatchSize) / 2
    );
  }

  /**
   * Process records with adaptive batch sizing
   */
  async processWithAdaptiveSizing(
    client: FluentClient,
    jobId: string,
    records: any[]
  ): Promise<void> {
    let processedCount = 0;

    while (processedCount < records.length) {
      const batch = records.slice(
        processedCount,
        processedCount + this.currentBatchSize
      );

      const startTime = Date.now();

      try {
        await client.sendBatch(jobId, {
          action: 'UPSERT',
          entityType: 'INVENTORY',
          entities: batch
        });

        const processingTime = Date.now() - startTime;
        this.adjustBatchSize(processingTime, batch.length);

        processedCount += batch.length;

        console.log(
          `Processed ${batch.length} records in ${processingTime}ms. ` +
          `New batch size: ${this.currentBatchSize}`
        );

      } catch (error) {
        console.error(`Batch failed, reducing batch size`);
        this.currentBatchSize = Math.max(
          this.config.minBatchSize,
          Math.floor(this.currentBatchSize * 0.5)
        );
        // Don't increment processedCount - retry this batch
      }
    }
  }

  /**
   * Adjust batch size based on processing time
   */
  private adjustBatchSize(processingTimeMs: number, recordsProcessed: number): void {
    this.processingHistory.push(processingTimeMs);

    // Keep last 10 measurements
    if (this.processingHistory.length > 10) {
      this.processingHistory.shift();
    }

    const avgProcessingTime =
      this.processingHistory.reduce((a, b) => a + b, 0) /
      this.processingHistory.length;

    // If processing is faster than target, increase batch size
    if (avgProcessingTime < this.config.targetProcessingTimeMs * 0.8) {
      this.currentBatchSize = Math.min(
        this.config.maxBatchSize,
        Math.floor(this.currentBatchSize * (1 + this.config.adjustmentFactor))
      );
    }
    // If processing is slower than target, decrease batch size
    else if (avgProcessingTime > this.config.targetProcessingTimeMs * 1.2) {
      this.currentBatchSize = Math.max(
        this.config.minBatchSize,
        Math.floor(this.currentBatchSize * (1 - this.config.adjustmentFactor))
      );
    }
    // Otherwise maintain current size
  }

  /**
   * Get current batch size
   */
  getCurrentBatchSize(): number {
    return this.currentBatchSize;
  }
}

// Usage example
const adaptiveProcessor = new AdaptiveBatchProcessor({
  minBatchSize: 500,
  maxBatchSize: 5000,
  targetProcessingTimeMs: 3000, // Target 3 seconds per batch
  adjustmentFactor: 0.2 // Adjust by 20% each time
});

await adaptiveProcessor.processWithAdaptiveSizing(client, jobId, allRecords);
```

## Parallel Processing

### Concurrency Control

```typescript
async function parallelFileProcessing(
  files: string[],
  concurrency: number = 5
) {
  const queue = [...files];
  const results: any[] = [];

  // Create worker pool
  const workers = Array(concurrency)
    .fill(null)
    .map(async () => {
      while (queue.length > 0) {
        const file = queue.shift();
        if (file) {
          try {
            const result = await processFile(file);
            results.push({ file, status: 'success', result });
          } catch (error) {
            results.push({ file, status: 'error', error: (error as Error).message });
          }
        }
      }
    });

  // Wait for all workers to complete
  await Promise.all(workers);

  console.log(`Processed ${results.length} files with ${concurrency} workers`);
  return results;
}
```

### Production Parallel Processing with Error Handling

Complete parallel processing implementation with retry logic and error isolation:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  StateService,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

interface ProcessingResult {
  fileName: string;
  status: 'success' | 'error' | 'skipped';
  recordsProcessed?: number;
  error?: string;
  processingTimeMs?: number;
}

interface ParallelIngestionConfig {
  concurrency: number;
  batchSize: number;
  maxRetries: number;
  retryDelayMs: number;
}

class ParallelIngestionProcessor {
  constructor(
    private client: FluentClient,
    private s3: S3DataSource,
    private parser: CSVParserService,
    private mapper: UniversalMapper,
    private state: StateService,
    private kv: KVStore,
    private config: ParallelIngestionConfig
  ) {}

  /**
   * Process multiple files in parallel with concurrency control
   */
  async processFiles(
    bucket: string,
    fileKeys: string[]
  ): Promise<ProcessingResult[]> {
    const results: ProcessingResult[] = [];
    const queue = [...fileKeys];
    let activeWorkers = 0;
    const maxWorkers = this.config.concurrency;

    console.log(`Processing ${fileKeys.length} files with ${maxWorkers} workers`);

    // Create worker pool
    const workers = Array(maxWorkers)
      .fill(null)
      .map(async (_, workerId) => {
        while (queue.length > 0) {
          const fileKey = queue.shift();
          if (!fileKey) break;

          activeWorkers++;
          console.log(`Worker ${workerId}: Processing ${fileKey} (${activeWorkers} active)`);

          const result = await this.processFileWithRetry(bucket, fileKey);
          results.push(result);

          activeWorkers--;
          console.log(`Worker ${workerId}: Completed ${fileKey} (${activeWorkers} active)`);
        }
      });

    // Wait for all workers to finish
    await Promise.all(workers);

    return results;
  }

  /**
   * Process a single file with retry logic
   */
  private async processFileWithRetry(
    bucket: string,
    fileKey: string
  ): Promise<ProcessingResult> {
    const startTime = Date.now();
    let lastError: Error | null = null;

    // Check if already processed
    if (await this.state.isFileProcessed(this.kv, fileKey)) {
      return {
        fileName: fileKey,
        status: 'skipped',
        processingTimeMs: Date.now() - startTime
      };
    }

    // Retry loop
    for (let attempt = 1; attempt <= this.config.maxRetries; attempt++) {
      try {
        const recordsProcessed = await this.processFileSingle(bucket, fileKey);

        return {
          fileName: fileKey,
          status: 'success',
          recordsProcessed,
          processingTimeMs: Date.now() - startTime
        };

      } catch (error) {
        lastError = error as Error;
        console.error(
          `Attempt ${attempt}/${this.config.maxRetries} failed for ${fileKey}: ${lastError.message}`
        );

        if (attempt < this.config.maxRetries) {
          const delay = this.config.retryDelayMs * attempt;
          console.log(`Retrying ${fileKey} in ${delay}ms...`);
          await new Promise(resolve => setTimeout(resolve, delay));
        }
      }
    }

    // All retries exhausted
    return {
      fileName: fileKey,
      status: 'error',
      error: lastError?.message || 'Unknown error',
      processingTimeMs: Date.now() - startTime
    };
  }

  /**
   * Process a single file
   */
  private async processFileSingle(
    bucket: string,
    fileKey: string
  ): Promise<number> {
    // Read file from S3
    const fileContent = await this.s3.downloadFile(fileKey);

    // Parse CSV
    const records = await this.parser.parse(fileContent);

    if (records.length === 0) {
      throw new Error('File contains no records');
    }

    // Map fields
    const mappingResult = await this.mapper.map(records);

    if (!mappingResult.success || mappingResult.data.length === 0) {
      throw new Error(`Field mapping failed: ${mappingResult.errors.join(', ')}`);
    }

    // Create job
    const job = await this.client.createJob({
      name: `Import - ${fileKey}`,
      retailerId: process.env.FLUENT_RETAILER_ID!,
      metadata: {
        fileName: fileKey,
        recordCount: mappingResult.data.length
      }
    });

    // Send in batches
    const batches = this.createBatches(mappingResult.data, this.config.batchSize);

    for (const [index, batch] of batches.entries()) {
      await this.client.sendBatch(job.id, {
        action: 'UPSERT',
        entityType: 'INVENTORY',
        entities: batch
      });

      console.log(`Sent batch ${index + 1}/${batches.length} for ${fileKey}`);
    }

    // Mark as processed
    await this.state.markFileProcessed(fileKey, {
      jobId: job.id,
      recordCount: mappingResult.data.length,
      timestamp: new Date().toISOString()
    });

    return mappingResult.data.length;
  }

  /**
   * Split records into batches
   */
  private createBatches<T>(items: T[], batchSize: number): T[][] {
    const batches: T[][] = [];
    for (let i = 0; i < items.length; i += batchSize) {
      batches.push(items.slice(i, i + batchSize));
    }
    return batches;
  }
}

// Usage example
const kvAdapter = new VersoriKVAdapter(openKv(':project:'));
const processor = new ParallelIngestionProcessor(
  client,
  s3DataSource,
  csvParser,
  mapper,
  stateService,
  kvAdapter,
  {
    concurrency: 5,
    batchSize: 2000,
    maxRetries: 3,
    retryDelayMs: 1000
  }
);

const files = await s3DataSource.listFiles({ prefix: 'data/' });
const results = await processor.processFiles('bucket', files.map(f => f.path));

// Summary
const successful = results.filter(r => r.status === 'success').length;
const failed = results.filter(r => r.status === 'error').length;
const skipped = results.filter(r => r.status === 'skipped').length;

console.log(`Processing complete: ${successful} successful, ${failed} failed, ${skipped} skipped`);
```

### Rate Limiting

```typescript
class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private maxTokens: number,
    private refillRate: number // tokens per second
  ) {
    this.tokens = maxTokens;
    this.lastRefill = Date.now();
  }

  async acquire(tokens: number = 1): Promise<void> {
    await this.refillTokens();

    while (this.tokens < tokens) {
      await new Promise(resolve => setTimeout(resolve, 100));
      await this.refillTokens();
    }

    this.tokens -= tokens;
  }

  private async refillTokens(): Promise<void> {
    const now = Date.now();
    const timePassed = (now - this.lastRefill) / 1000;
    const tokensToAdd = timePassed * this.refillRate;

    this.tokens = Math.min(this.maxTokens, this.tokens + tokensToAdd);
    this.lastRefill = now;
  }
}

// Usage
const rateLimiter = new RateLimiter(100, 10); // 100 max, 10 per second

async function rateLimitedProcessing(files: string[]) {
  for (const file of files) {
    await rateLimiter.acquire();
    await processFile(file);
  }
}
```

## Job Strategy Comparison

### DAILY Strategy

**When to use:** Frequent, small updates throughout the day

```typescript
async function dailyJobStrategy(client: FluentClient) {
  const dateKey = new Date().toISOString().split('T')[0];

  // Reuse same job for all files processed today
  const jobId = await findOrCreateDailyJob(client, dateKey);

  const files = await s3.listFiles({ prefix: 'data/' });

  for (const file of files) {
    // All files use the same job
    // (fileData = the parsed and mapped records for this file;
    //  parsing omitted here for brevity)
    await client.sendBatch(jobId, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: fileData
    });
  }
}

async function findOrCreateDailyJob(
  client: FluentClient,
  dateKey: string
): Promise<string> {
  // Check state for today's job
  const existingJobId = await state.get(`daily-job:${dateKey}`);

  if (existingJobId) {
    return existingJobId;
  }

  // Create new daily job
  const job = await client.createJob({
    name: `Daily Inventory - ${dateKey}`,
    retailerId: process.env.FLUENT_RETAILER_ID!
  });

  // Cache for today
  await state.set(`daily-job:${dateKey}`, job.id, 24 * 60 * 60 * 1000);

  return job.id;
}
```

**Pros:**
- ✅ Fewer jobs to track
- ✅ Consolidated daily batches
- ✅ Easier monitoring

**Cons:**
- ❌ All files succeed/fail together
- ❌ Harder to track individual file status

### PER_FILE Strategy

**When to use:** Large files, independent processing, failure isolation

```typescript
async function perFileJobStrategy(client: FluentClient) {
  const files = await s3.listFiles({ prefix: 'data/' });

  for (const file of files) {
    // Create dedicated job for each file
    const job = await client.createJob({
      name: `Import - ${file.path}`,
      retailerId: process.env.FLUENT_RETAILER_ID!,
      metadata: {
        fileName: file.path,
        fileSize: file.size
      }
    });

    // fileData = the parsed and mapped records for this file
    // (parsing omitted here for brevity)
    await client.sendBatch(job.id, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: fileData
    });

    // Track job per file
    await state.markFileProcessed(file.path, {
      jobId: job.id,
      status: 'success'
    });
  }
}
```

**Pros:**
- ✅ Independent file processing
- ✅ Fine-grained error tracking
- ✅ Easy retry of failed files

**Cons:**
- ❌ More jobs to manage
- ❌ More API calls
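
One payoff of per-file jobs is cheap retries: because each file's outcome is tracked independently, re-queuing failures is a simple filter over the tracked results. A minimal, SDK-independent sketch — the `FileOutcome` shape is illustrative, not an SDK type:

```typescript
// Illustrative outcome record; in practice this comes from your state store.
interface FileOutcome {
  fileName: string;
  status: 'success' | 'error' | 'skipped';
}

// Select only the files whose dedicated job failed, for re-processing.
function filesToRetry(outcomes: FileOutcome[]): string[] {
  return outcomes
    .filter(o => o.status === 'error')
    .map(o => o.fileName);
}

const outcomes: FileOutcome[] = [
  { fileName: 'a.csv', status: 'success' },
  { fileName: 'b.csv', status: 'error' },
  { fileName: 'c.csv', status: 'skipped' }
];
console.log(filesToRetry(outcomes)); // → [ 'b.csv' ]
```

With the DAILY strategy this distinction is lost, since every file shares one job's success or failure.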

## Memory Optimization

### Streaming Large Files

```typescript
async function streamLargeFile(filePath: string) {
  const s3 = new S3DataSource(
    {
      type: 'S3_CSV',
      connectionId: 's3-stream',
      name: 'S3 Stream',
      s3Config: s3Config,
    },
    logger
  );
  const parser = new CSVParserService();

  // Stream file from S3 (no full file in memory)
  const data = await s3.downloadFile(filePath);
  const recordIterator = parser.parseStreaming(data);

  let batch = [];
  const BATCH_SIZE = 1000;

  for await (const record of recordIterator) {
    batch.push(record);

    if (batch.length >= BATCH_SIZE) {
      await processBatch(batch);
      batch = []; // Free memory
    }
  }

  // Process remaining records
  if (batch.length > 0) {
    await processBatch(batch);
  }
}
```

### Chunked Processing for Memory Efficiency

Process large files in chunks to prevent memory exhaustion:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

interface ChunkProcessingConfig {
  chunkSize: number; // Records per chunk
  batchSize: number; // Records per batch API call
  enableStreaming: boolean;
}

async function processLargeFileInChunks(
  client: FluentClient,
  s3: S3DataSource,
  bucket: string,
  fileKey: string,
  mapper: UniversalMapper,
  config: ChunkProcessingConfig
): Promise<void> {
  console.log(`Processing ${fileKey} with streaming: ${config.enableStreaming}`);

  let totalProcessed = 0;
  let currentChunk: any[] = [];

  if (config.enableStreaming) {
    // Streaming approach - memory efficient
    const data = await s3.downloadFile(fileKey);
    const parser = new CSVParserService();
    const recordIterator = parser.parseStreaming(data);

    for await (const record of recordIterator) {
      currentChunk.push(record);

      if (currentChunk.length >= config.chunkSize) {
        const processed = await processChunk(
          client,
          currentChunk,
          mapper,
          config.batchSize,
          fileKey
        );
        totalProcessed += processed;

        // Clear chunk from memory
        currentChunk = [];

        // Log memory usage
        logMemoryUsage(`After chunk (${totalProcessed} records)`);
      }
    }
  } else {
    // Non-streaming approach - load entire file
    const fileContent = await s3.downloadFile(fileKey);
    const parser = new CSVParserService();
    const allRecords = await parser.parse(fileContent);

    // Process in chunks
    for (let i = 0; i < allRecords.length; i += config.chunkSize) {
      const chunk = allRecords.slice(i, i + config.chunkSize);
      const processed = await processChunk(
        client,
        chunk,
        mapper,
        config.batchSize,
        fileKey
      );
      totalProcessed += processed;

      logMemoryUsage(`After chunk (${totalProcessed} records)`);
    }
  }

  // Process remaining records (only populated by the streaming path)
  if (currentChunk.length > 0) {
    const processed = await processChunk(
      client,
      currentChunk,
      mapper,
      config.batchSize,
      fileKey
    );
    totalProcessed += processed;
  }

  console.log(`Completed ${fileKey}: ${totalProcessed} records processed`);
}

async function processChunk(
  client: FluentClient,
  chunk: any[],
  mapper: UniversalMapper,
  batchSize: number,
  fileKey: string
): Promise<number> {
  // Map fields
  const mappingResult = await mapper.map(chunk);

  if (!mappingResult.success) {
    console.error(`Mapping errors in ${fileKey}:`, mappingResult.errors);
    return 0;
  }

  // Create job for this chunk
  const job = await client.createJob({
    name: `Import Chunk - ${fileKey} - ${new Date().toISOString()}`,
    retailerId: process.env.FLUENT_RETAILER_ID!
  });

  // Send in batches
  const batches = createBatches(mappingResult.data, batchSize);

  for (const batch of batches) {
    await client.sendBatch(job.id, {
      action: 'UPSERT',
      entityType: 'INVENTORY',
      entities: batch
    });
  }

  return mappingResult.data.length;
}

function createBatches<T>(items: T[], batchSize: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Usage
await processLargeFileInChunks(
  client,
  s3DataSource,
  'inventory-bucket',
  'large-file-500k-records.csv',
  mapper,
  {
    chunkSize: 5000, // Process 5K records at a time
    batchSize: 2000, // Send 2K records per batch
    enableStreaming: true
  }
);
```

### Memory Monitoring

```typescript
function logMemoryUsage(label: string) {
  const used = process.memoryUsage();
  console.log(`[${label}] Memory Usage:`);
  console.log(`  RSS: ${Math.round(used.rss / 1024 / 1024)} MB`);
  console.log(`  Heap Total: ${Math.round(used.heapTotal / 1024 / 1024)} MB`);
  console.log(`  Heap Used: ${Math.round(used.heapUsed / 1024 / 1024)} MB`);
}

async function memoryEfficientProcessing() {
  logMemoryUsage('Start');

  const files = await listFiles();
  logMemoryUsage('After listing files');

  for (const file of files) {
    await streamLargeFile(file);
    logMemoryUsage(`After processing ${file}`);

    // Force garbage collection if available
    // (requires running Node.js with the --expose-gc flag)
    if (global.gc) {
      global.gc();
    }
  }

  logMemoryUsage('End');
}
```

## Performance Monitoring

### Metrics Collection

```typescript
interface PerformanceMetrics {
  filesProcessed: number;
  recordsProcessed: number;
  totalTimeMs: number;
  avgTimePerFile: number;
  avgRecordsPerSecond: number;
  batchesSent: number;
  errors: number;
}

async function trackPerformance(
  // The operation receives the metrics object and is expected to
  // increment filesProcessed, recordsProcessed, batchesSent and errors
  operation: (metrics: PerformanceMetrics) => Promise<void>
): Promise<PerformanceMetrics> {
  const startTime = Date.now();
  const metrics: PerformanceMetrics = {
    filesProcessed: 0,
    recordsProcessed: 0,
    totalTimeMs: 0,
    avgTimePerFile: 0,
    avgRecordsPerSecond: 0,
    batchesSent: 0,
    errors: 0
  };

  await operation(metrics);

  metrics.totalTimeMs = Date.now() - startTime;
  metrics.avgTimePerFile =
    metrics.filesProcessed > 0 ? metrics.totalTimeMs / metrics.filesProcessed : 0;
  metrics.avgRecordsPerSecond =
    metrics.totalTimeMs > 0
      ? (metrics.recordsProcessed / metrics.totalTimeMs) * 1000
      : 0;

  return metrics;
}
```

## Complete Optimized Production Example

Putting it all together: adaptive sizing, parallel processing, streaming, and monitoring:

```typescript
import {
  createClient,
  S3DataSource,
  CSVParserService,
  UniversalMapper,
  StateService,
  VersoriKVAdapter,
  FluentClient
} from '@fluentcommerce/fc-connect-sdk';

/**
 * Production-grade optimized ingestion pipeline
 * Combines all performance optimization techniques
 */
class OptimizedIngestionPipeline {
  private client!: FluentClient;
  private s3!: S3DataSource;
  private parser!: CSVParserService;
  private mapper!: UniversalMapper;
  private state!: StateService;
  private kv!: KVStore;
  private adaptiveProcessor!: AdaptiveBatchProcessor;
  private metrics!: PerformanceMetrics;

  constructor(
    private config: {
      fluent: any;
      s3: any;
      mapping: any;
      kv: any;
      concurrency: number;
      enableStreaming: boolean;
      enableAdaptiveSizing: boolean;
    }
  ) {}

  /**
   * Initialize all components
   */
  async initialize(): Promise<void> {
    // Create Fluent client
    this.client = await createClient({ config: this.config.fluent });

    // Initialize logger
    const logger = toStructuredLogger(createConsoleLogger(), {
      service: 'optimized-ingestion',
      correlationId: generateCorrelationId()
    });

    // Initialize data sources
    this.s3 = new S3DataSource(
      {
        type: 'S3_CSV',
        connectionId: 's3-optimized',
        name: 'S3 Optimized',
        s3Config: this.config.s3,
      },
      logger
    );
    this.parser = new CSVParserService(logger);

    // Initialize field mapper
    this.mapper = new UniversalMapper(this.config.mapping);

    // Initialize state management
    this.kv = new VersoriKVAdapter(this.config.kv);
    this.state = new StateService(logger);

    // Initialize adaptive processor
    if (this.config.enableAdaptiveSizing) {
      this.adaptiveProcessor = new AdaptiveBatchProcessor({
        minBatchSize: 500,
        maxBatchSize: 5000,
        targetProcessingTimeMs: 3000,
        adjustmentFactor: 0.2
      });
    }

    // Initialize metrics
    this.metrics = {
      filesProcessed: 0,
      recordsProcessed: 0,
      totalTimeMs: 0,
      avgTimePerFile: 0,
      avgRecordsPerSecond: 0,
      batchesSent: 0,
      errors: 0
    };

    console.log('Optimized ingestion pipeline initialized');
  }

  /**
   * Run the complete ingestion pipeline
   */
  async run(bucket: string, prefix: string): Promise<PerformanceMetrics> {
    const startTime = Date.now();

    try {
      // List all files
      console.log(`Listing files in s3://${bucket}/${prefix}`);
      const allFiles = await this.s3.listFiles({ prefix });
      console.log(`Found ${allFiles.length} files`);

      // Filter unprocessed files
      const unprocessedFiles = [];
      for (const file of allFiles) {
        if (!(await this.state.isFileProcessed(this.kv, file.path))) {
          unprocessedFiles.push(file);
        }
      }

      console.log(
        `${unprocessedFiles.length} files to process (${allFiles.length - unprocessedFiles.length} already processed)`
      );
|
+
|
|
1032
|
+
if (unprocessedFiles.length === 0) {
|
|
1033
|
+
console.log('No files to process');
|
|
1034
|
+
return this.metrics;
|
|
1035
|
+
}
|
|
1036
|
+
|
|
1037
|
+
// Process files in parallel
|
|
1038
|
+
await this.processFilesInParallel(bucket, unprocessedFiles);
|
|
1039
|
+
|
|
1040
|
+
// Calculate metrics
|
|
1041
|
+
this.metrics.totalTimeMs = Date.now() - startTime;
|
|
1042
|
+
this.metrics.avgTimePerFile = this.metrics.totalTimeMs / this.metrics.filesProcessed;
|
|
1043
|
+
this.metrics.avgRecordsPerSecond =
|
|
1044
|
+
(this.metrics.recordsProcessed / this.metrics.totalTimeMs) * 1000;
|
|
1045
|
+
|
|
1046
|
+
// Log summary
|
|
1047
|
+
this.logSummary();
|
|
1048
|
+
|
|
1049
|
+
return this.metrics;
|
|
1050
|
+
|
|
1051
|
+
} catch (error) {
|
|
1052
|
+
console.error('Pipeline failed:', error);
|
|
1053
|
+
throw error;
|
|
1054
|
+
}
|
|
1055
|
+
}
|
|
1056
|
+
|
|
1057
|
+
/**
|
|
1058
|
+
* Process multiple files in parallel
|
|
1059
|
+
*/
|
|
1060
|
+
private async processFilesInParallel(
|
|
1061
|
+
bucket: string,
|
|
1062
|
+
files: Array<{ key: string; size: number }>
|
|
1063
|
+
): Promise<void> {
|
|
1064
|
+
const queue = [...files];
|
|
1065
|
+
const concurrency = this.config.concurrency;
|
|
1066
|
+
|
|
1067
|
+
console.log(`Processing ${files.length} files with ${concurrency} workers`);
|
|
1068
|
+
|
|
1069
|
+
// Create worker pool
|
|
1070
|
+
const workers = Array(concurrency)
|
|
1071
|
+
.fill(null)
|
|
1072
|
+
.map(async (_, workerId) => {
|
|
1073
|
+
while (queue.length > 0) {
|
|
1074
|
+
const file = queue.shift();
|
|
1075
|
+
if (!file) break;
|
|
1076
|
+
|
|
1077
|
+
console.log(
|
|
1078
|
+
`Worker ${workerId}: Processing ${file.path} (${queue.length} remaining)`
|
|
1079
|
+
);
|
|
1080
|
+
|
|
1081
|
+
try {
|
|
1082
|
+
await this.processFile(bucket, file.path);
|
|
1083
|
+
this.metrics.filesProcessed++;
|
|
1084
|
+
} catch (error) {
|
|
1085
|
+
console.error(`Worker ${workerId}: Failed to process ${file.path}:`, error);
|
|
1086
|
+
this.metrics.errors++;
|
|
1087
|
+
}
|
|
1088
|
+
}
|
|
1089
|
+
});
|
|
1090
|
+
|
|
1091
|
+
await Promise.all(workers);
|
|
1092
|
+
}
|
|
1093
|
+
|
|
1094
|
+
/**
|
|
1095
|
+
* Process a single file with all optimizations
|
|
1096
|
+
*/
|
|
1097
|
+
private async processFile(bucket: string, fileKey: string): Promise<void> {
|
|
1098
|
+
const fileStartTime = Date.now();
|
|
1099
|
+
|
|
1100
|
+
// Create job for this file
|
|
1101
|
+
const job = await this.client.createJob({
|
|
1102
|
+
name: `Optimized Import - ${fileKey}`,
|
|
1103
|
+
retailerId: this.config.fluent.retailerId,
|
|
1104
|
+
metadata: {
|
|
1105
|
+
fileName: fileKey,
|
|
1106
|
+
pipeline: 'optimized'
|
|
1107
|
+
}
|
|
1108
|
+
});
|
|
1109
|
+
|
|
1110
|
+
let recordsInFile = 0;
|
|
1111
|
+
|
|
1112
|
+
if (this.config.enableStreaming) {
|
|
1113
|
+
// Streaming approach - memory efficient
|
|
1114
|
+
recordsInFile = await this.processFileStreaming(bucket, fileKey, job.id);
|
|
1115
|
+
} else {
|
|
1116
|
+
// Standard approach - load entire file
|
|
1117
|
+
recordsInFile = await this.processFileStandard(bucket, fileKey, job.id);
|
|
1118
|
+
}
|
|
1119
|
+
|
|
1120
|
+
// Mark file as processed
|
|
1121
|
+
await this.state.markFileProcessed(fileKey, {
|
|
1122
|
+
jobId: job.id,
|
|
1123
|
+
recordCount: recordsInFile,
|
|
1124
|
+
processingTimeMs: Date.now() - fileStartTime,
|
|
1125
|
+
timestamp: new Date().toISOString()
|
|
1126
|
+
});
|
|
1127
|
+
|
|
1128
|
+
this.metrics.recordsProcessed += recordsInFile;
|
|
1129
|
+
|
|
1130
|
+
console.log(
|
|
1131
|
+
`Completed ${fileKey}: ${recordsInFile} records in ${Date.now() - fileStartTime}ms`
|
|
1132
|
+
);
|
|
1133
|
+
}
|
|
1134
|
+
|
|
1135
|
+
/**
|
|
1136
|
+
* Process file using streaming (memory efficient)
|
|
1137
|
+
*/
|
|
1138
|
+
private async processFileStreaming(
|
|
1139
|
+
bucket: string,
|
|
1140
|
+
fileKey: string,
|
|
1141
|
+
jobId: string
|
|
1142
|
+
): Promise<number> {
|
|
1143
|
+
const stream = await this.s3.streamObject(bucket, fileKey);
|
|
1144
|
+
const recordIterator = this.parser.parseStream(stream);
|
|
1145
|
+
|
|
1146
|
+
let batch: any[] = [];
|
|
1147
|
+
let totalRecords = 0;
|
|
1148
|
+
const batchSize = this.config.enableAdaptiveSizing
|
|
1149
|
+
? this.adaptiveProcessor.getCurrentBatchSize()
|
|
1150
|
+
: 2000;
|
|
1151
|
+
|
|
1152
|
+
for await (const record of recordIterator) {
|
|
1153
|
+
batch.push(record);
|
|
1154
|
+
|
|
1155
|
+
if (batch.length >= batchSize) {
|
|
1156
|
+
const mappingResult = await this.mapper.map(batch);
|
|
1157
|
+
|
|
1158
|
+
if (mappingResult.success && mappingResult.data.length > 0) {
|
|
1159
|
+
await this.client.sendBatch(jobId, {
|
|
1160
|
+
action: 'UPSERT',
|
|
1161
|
+
entityType: 'INVENTORY',
|
|
1162
|
+
entities: mappingResult.data
|
|
1163
|
+
});
|
|
1164
|
+
|
|
1165
|
+
this.metrics.batchesSent++;
|
|
1166
|
+
totalRecords += mappingResult.data.length;
|
|
1167
|
+
}
|
|
1168
|
+
|
|
1169
|
+
batch = []; // Free memory
|
|
1170
|
+
logMemoryUsage(`After batch (${totalRecords} records)`);
|
|
1171
|
+
}
|
|
1172
|
+
}
|
|
1173
|
+
|
|
1174
|
+
// Process remaining records
|
|
1175
|
+
if (batch.length > 0) {
|
|
1176
|
+
const mappingResult = await this.mapper.map(batch);
|
|
1177
|
+
|
|
1178
|
+
if (mappingResult.success && mappingResult.data.length > 0) {
|
|
1179
|
+
await this.client.sendBatch(jobId, {
|
|
1180
|
+
action: 'UPSERT',
|
|
1181
|
+
entityType: 'INVENTORY',
|
|
1182
|
+
entities: mappingResult.data
|
|
1183
|
+
});
|
|
1184
|
+
|
|
1185
|
+
this.metrics.batchesSent++;
|
|
1186
|
+
totalRecords += mappingResult.data.length;
|
|
1187
|
+
}
|
|
1188
|
+
}
|
|
1189
|
+
|
|
1190
|
+
return totalRecords;
|
|
1191
|
+
}
|
|
1192
|
+
|
|
1193
|
+
/**
|
|
1194
|
+
* Process file using standard approach (adaptive sizing)
|
|
1195
|
+
*/
|
|
1196
|
+
private async processFileStandard(
|
|
1197
|
+
bucket: string,
|
|
1198
|
+
fileKey: string,
|
|
1199
|
+
jobId: string
|
|
1200
|
+
): Promise<number> {
|
|
1201
|
+
// Read entire file
|
|
1202
|
+
const fileContent = await this.s3.downloadFile(fileKey);
|
|
1203
|
+
const records = await this.parser.parse(fileContent);
|
|
1204
|
+
|
|
1205
|
+
// Map all records
|
|
1206
|
+
const mappingResult = await this.mapper.map(records);
|
|
1207
|
+
|
|
1208
|
+
if (!mappingResult.success || mappingResult.data.length === 0) {
|
|
1209
|
+
throw new Error(`Field mapping failed: ${mappingResult.errors.join(', ')}`);
|
|
1210
|
+
}
|
|
1211
|
+
|
|
1212
|
+
// Use adaptive sizing or standard batching
|
|
1213
|
+
if (this.config.enableAdaptiveSizing) {
|
|
1214
|
+
await this.adaptiveProcessor.processWithAdaptiveSizing(
|
|
1215
|
+
this.client,
|
|
1216
|
+
jobId,
|
|
1217
|
+
mappingResult.data
|
|
1218
|
+
);
|
|
1219
|
+
this.metrics.batchesSent += Math.ceil(
|
|
1220
|
+
mappingResult.data.length / this.adaptiveProcessor.getCurrentBatchSize()
|
|
1221
|
+
);
|
|
1222
|
+
} else {
|
|
1223
|
+
const batches = this.createBatches(mappingResult.data, 2000);
|
|
1224
|
+
|
|
1225
|
+
for (const batch of batches) {
|
|
1226
|
+
await this.client.sendBatch(jobId, {
|
|
1227
|
+
action: 'UPSERT',
|
|
1228
|
+
entityType: 'INVENTORY',
|
|
1229
|
+
entities: batch
|
|
1230
|
+
});
|
|
1231
|
+
|
|
1232
|
+
this.metrics.batchesSent++;
|
|
1233
|
+
}
|
|
1234
|
+
}
|
|
1235
|
+
|
|
1236
|
+
return mappingResult.data.length;
|
|
1237
|
+
}
|
|
1238
|
+
|
|
1239
|
+
/**
|
|
1240
|
+
* Create batches from records
|
|
1241
|
+
*/
|
|
1242
|
+
private createBatches<T>(items: T[], batchSize: number): T[][] {
|
|
1243
|
+
const batches: T[][] = [];
|
|
1244
|
+
for (let i = 0; i < items.length; i += batchSize) {
|
|
1245
|
+
batches.push(items.slice(i, i + batchSize));
|
|
1246
|
+
}
|
|
1247
|
+
return batches;
|
|
1248
|
+
}
|
|
1249
|
+
|
|
1250
|
+
/**
|
|
1251
|
+
* Log performance summary
|
|
1252
|
+
*/
|
|
1253
|
+
private logSummary(): void {
|
|
1254
|
+
console.log('\n═══════════════════════════════════════════════');
|
|
1255
|
+
console.log(' INGESTION PERFORMANCE SUMMARY ');
|
|
1256
|
+
console.log('═══════════════════════════════════════════════');
|
|
1257
|
+
console.log(`Files Processed: ${this.metrics.filesProcessed}`);
|
|
1258
|
+
console.log(`Records Processed: ${this.metrics.recordsProcessed.toLocaleString()}`);
|
|
1259
|
+
console.log(`Batches Sent: ${this.metrics.batchesSent}`);
|
|
1260
|
+
console.log(`Errors: ${this.metrics.errors}`);
|
|
1261
|
+
console.log(`Total Time: ${(this.metrics.totalTimeMs / 1000).toFixed(2)}s`);
|
|
1262
|
+
console.log(`Avg Time Per File: ${(this.metrics.avgTimePerFile / 1000).toFixed(2)}s`);
|
|
1263
|
+
console.log(`Throughput: ${Math.round(this.metrics.avgRecordsPerSecond)} records/sec`);
|
|
1264
|
+
console.log(`Success Rate: ${((this.metrics.filesProcessed / (this.metrics.filesProcessed + this.metrics.errors)) * 100).toFixed(1)}%`);
|
|
1265
|
+
console.log('═══════════════════════════════════════════════\n');
|
|
1266
|
+
}
|
|
1267
|
+
}
|
|
1268
|
+
|
|
1269
|
+
// ===== Usage Example =====
|
|
1270
|
+
|
|
1271
|
+
async function main() {
|
|
1272
|
+
// Create optimized pipeline
|
|
1273
|
+
const pipeline = new OptimizedIngestionPipeline({
|
|
1274
|
+
fluent: {
|
|
1275
|
+
baseUrl: process.env.FLUENT_BASE_URL!,
|
|
1276
|
+
clientId: process.env.FLUENT_CLIENT_ID!,
|
|
1277
|
+
clientSecret: process.env.FLUENT_CLIENT_SECRET!,
|
|
1278
|
+
retailerId: process.env.FLUENT_RETAILER_ID!
|
|
1279
|
+
},
|
|
1280
|
+
s3: {
|
|
1281
|
+
region: process.env.AWS_REGION!,
|
|
1282
|
+
credentials: {
|
|
1283
|
+
accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
|
|
1284
|
+
secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!
|
|
1285
|
+
}
|
|
1286
|
+
},
|
|
1287
|
+
mapping: {
|
|
1288
|
+
fields: {
|
|
1289
|
+
ref: { source: 'sku', required: true },
|
|
1290
|
+
productRef: { source: 'product_id', required: true },
|
|
1291
|
+
locationRef: { source: 'warehouse_code', required: true },
|
|
1292
|
+
qty: { source: 'quantity', resolver: 'sdk.parseInt', required: true },
|
|
1293
|
+
type: { source: 'inventory_type', default: 'ON_HAND' },
|
|
1294
|
+
status: { source: 'status', default: 'AVAILABLE' }
|
|
1295
|
+
}
|
|
1296
|
+
},
|
|
1297
|
+
kv: openKv(), // Versori KV or custom implementation
|
|
1298
|
+
concurrency: 5, // 5 parallel workers
|
|
1299
|
+
enableStreaming: true, // Use streaming for memory efficiency
|
|
1300
|
+
enableAdaptiveSizing: true // Auto-adjust batch sizes
|
|
1301
|
+
});
|
|
1302
|
+
|
|
1303
|
+
// Initialize pipeline
|
|
1304
|
+
await pipeline.initialize();
|
|
1305
|
+
|
|
1306
|
+
// Run ingestion
|
|
1307
|
+
const metrics = await pipeline.run(
|
|
1308
|
+
'inventory-bucket',
|
|
1309
|
+
'data/inventory/'
|
|
1310
|
+
);
|
|
1311
|
+
|
|
1312
|
+
console.log('Pipeline completed successfully');
|
|
1313
|
+
process.exit(0);
|
|
1314
|
+
}
|
|
1315
|
+
|
|
1316
|
+
// Run if executed directly
|
|
1317
|
+
if (require.main === module) {
|
|
1318
|
+
main().catch(error => {
|
|
1319
|
+
console.error('Fatal error:', error);
|
|
1320
|
+
process.exit(1);
|
|
1321
|
+
});
|
|
1322
|
+
}
|
|
1323
|
+
```

**What This Example Demonstrates:**

- ✅ **Parallel processing** - 5 concurrent workers process files simultaneously
- ✅ **Adaptive batch sizing** - Automatically adjusts batch size based on performance
- ✅ **Streaming support** - Memory-efficient processing for large files
- ✅ **State management** - Prevents duplicate file processing
- ✅ **Comprehensive metrics** - Detailed performance tracking and reporting
- ✅ **Error isolation** - One file failure doesn't stop the entire pipeline
- ✅ **Production-ready** - Includes logging, error handling, and monitoring

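The error-isolation point above comes down to a reusable pattern: a shared queue drained by N async workers, with a per-item try/catch so one failure never aborts the run. A minimal standalone sketch of that pattern (the `runPool` name and shape are illustrative, not an SDK API):

```typescript
// Minimal worker-pool sketch: N workers pull from one shared queue;
// each item is wrapped in try/catch so failures are counted, not fatal.
async function runPool<T>(
  items: T[],
  concurrency: number,
  work: (item: T) => Promise<void>
): Promise<{ ok: number; failed: number }> {
  const queue = [...items];
  let ok = 0;
  let failed = 0;

  const workers = Array(concurrency)
    .fill(null)
    .map(async () => {
      while (queue.length > 0) {
        const item = queue.shift();
        if (item === undefined) break;
        try {
          await work(item); // process one item
          ok++;
        } catch {
          failed++; // isolate the failure and keep draining the queue
        }
      }
    });

  await Promise.all(workers);
  return { ok, failed };
}
```

Because all workers share one queue, throughput scales with `concurrency` while a single failing item only increments the failure count.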
## Key Takeaways

- 🎯 **Batch size** - 1000-2000 records is optimal for most use cases
- 🎯 **Parallel processing** - Use 3-5 workers for best throughput
- 🎯 **DAILY strategy** - Best for frequent small updates
- 🎯 **PER_FILE strategy** - Best for large independent files
- 🎯 **Streaming** - Essential for files > 100 MB

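As a rough illustration, the takeaways above can be folded into a small tuning helper. This is a sketch with illustrative names (`tunePipeline` is not an SDK function); the 100 MB streaming threshold and batch-size/worker numbers come from this module's guidance:

```typescript
interface PipelineTuning {
  useStreaming: boolean;
  batchSize: number;
  concurrency: number;
}

// Pick pipeline settings from file size and file count, per the rules of thumb.
function tunePipeline(fileSizeBytes: number, fileCount: number): PipelineTuning {
  return {
    // Stream files larger than 100 MB to keep memory usage flat
    useStreaming: fileSizeBytes > 100 * 1024 * 1024,
    // Stay at the top of the 1000-2000 record sweet spot
    batchSize: 2000,
    // Up to 5 workers, but never more than there are files (min 1)
    concurrency: Math.max(1, Math.min(5, fileCount)),
  };
}
```

For example, a single 250 MB file yields `useStreaming: true` with one worker, while a drop of twenty small files yields five workers processing in parallel.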
## Next Steps

Continue to [Module 9: Best Practices →](./02-core-guides-ingestion-09-best-practices.md) to learn about comprehensive error handling, monitoring strategies, security best practices, and production-ready deployment patterns.

---

[← Back to Ingestion Guide](../ingestion-readme.md) | [Previous: Module 7 - State Management](./02-core-guides-ingestion-07-state-management.md) | [Next: Module 9 - Best Practices →](./02-core-guides-ingestion-09-best-practices.md)