npm - @fluentcommerce/fc-connect-sdk - Versions diffs - 0.1.53 → 0.1.55 - Mend

@fluentcommerce/fc-connect-sdk 0.1.53 → 0.1.55

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (495)

package/docs/01-TEMPLATES/versori/workflows/extraction/extraction-modes-guide.md CHANGED

@@ -1,1038 +1,1038 @@
-# Extraction Modes Guide
-**FC Connect SDK - Choosing the Right Extraction Mode**
-> This guide explains the two extraction modes available in the SDK and when to use each one.
----
-## Overview
-The FC Connect SDK supports two extraction modes for GraphQL queries:
-1. **Incremental** - Extract only changed records since last run (recommended for scheduled workflows)
-2. **Date Range** - Extract records within specific date window (for ad-hoc queries, backfills, or historical data)
-**Note:** "Historical extraction" is simply Date Range mode with a very old start date (e.g., `updatedAfter: '1970-01-01'`). There is no separate "historical mode" - use Date Range with appropriate validation and safeguards.
----
-## Quick Comparison Table
-| Mode            | Use Case                                      | Safety Level | Recommended Frequency | Typical Records | Production Risk                     |
-| --------------- | --------------------------------------------- | ------------ | --------------------- | --------------- | ----------------------------------- |
-| **Incremental** | Scheduled syncs, recurring extractions        | ✅ **Safe**  | Hourly/Daily          | ~10k            | ✅ **Low** - Safe for production    |
-| **Date Range**  | Ad-hoc audits, backfills, historical dumps    | ⚠️ **Risky** | One-time only         | ~50k            | ⚠️ **Medium** - Requires validation |
----
-## Mode 1: Incremental (Recommended) ✅
-### What It Does
-Extracts only records updated since the last successful extraction run.
-### How It Works
-```typescript
-const startTime = Date.now();
-log.info('📦 Starting incremental extraction');
-// Load last run timestamp from state
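-// State is saved below as { timestamp: ISO string }; this assumes kv.get returns that stored object
-const lastRun = await kv.get(['extraction', 'products', 'lastRunTime']);
-// First run has no state yet, so fall back to a fixed start date
-const lastRunTime = lastRun?.timestamp ?? '2024-01-01T00:00:00Z';
-log.info('⏱️ Last run timestamp loaded', { lastRunTime });
-// Query with buffered timestamp (overlap buffer prevents gaps)
-const bufferedTime = new Date(new Date(lastRunTime).getTime() - 60000).toISOString(); // 60 second buffer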
-const result = await client.graphql({
-  query: PRODUCTS_QUERY,
-  variables: {
-    retailerId: 'my-retailer',
-    updatedAfter: bufferedTime, // Only changed records
-    first: 200,
-  },
-  pagination: { maxRecords: 10000 },
-});
-log.info('✅ Extraction complete', {
-  recordCount: result.edges.length,
-  duration: Date.now() - startTime
-});
-// Save new timestamp after successful extraction
-await kv.set(['extraction', 'products', 'lastRunTime'], {
-  timestamp: new Date().toISOString(),
-});
-```
-### When to Use
-- ✅ **Scheduled extractions** (hourly, daily, every 15 minutes)
-- ✅ **Real-time inventory feeds**
-- ✅ **Order status updates**
-- ✅ **Product catalog syncs**
-- ✅ **Any recurring extraction workflow**
-### Safety Features
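-- Bounded volume: each run fetches only records changed since the last run
-- Overlap buffer prevents missed records
-- State tracking avoids re-extracting unchanged records (records inside the overlap buffer can repeat, so consumers should tolerate duplicates)
-- Predictable record counts
-- Recovery after failures: the timestamp is saved only after a successful run, so the next run resumes from the last good one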
-### Recommended Settings
-| Entity                   | Frequency    | Max Records | Expected Volume    |
-| ------------------------ | ------------ | ----------- | ------------------ |
-| **Products**             | Daily        | 10,000      | 5k-10k updates/day |
-| **Orders**               | Hourly       | 5,000       | 100-500/hour       |
-| **Fulfillments**         | Hourly       | 5,000       | 100-500/hour       |
-| **Inventory Quantities** | Every 15 min | 10,000      | 500-2k per run     |
-| **Inventory Positions**  | Daily        | 20,000      | 10k-20k per day    |
-| **Virtual Positions**    | Hourly       | 10,000      | 1k-5k per hour     |
-### Configuration Example
-```json
-{
-  "extractionMode": "incremental",
-  "pageSize": 200,
-  "maxRecords": 10000,
-  "overlapBufferSeconds": 60,
-  "fallbackStartDate": "2024-01-01T00:00:00Z"
-}
-```
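The `overlapBufferSeconds` and `fallbackStartDate` settings above can be combined into a single start-timestamp calculation. The following is a minimal sketch, not SDK code: `resolveUpdatedAfter` is a hypothetical helper, and it assumes the last run time is available as an ISO 8601 string (or is absent on the first run).

```typescript
// Hypothetical helper for illustration; not part of @fluentcommerce/fc-connect-sdk.
function resolveUpdatedAfter(
  lastRunIso: string | null | undefined,
  overlapBufferSeconds: number,
  fallbackStartDate: string
): string {
  // First run (no saved state): start from the configured fallback date.
  if (!lastRunIso) {
    return new Date(fallbackStartDate).toISOString();
  }
  const lastRunMs = Date.parse(lastRunIso);
  if (Number.isNaN(lastRunMs)) {
    throw new Error(`Invalid last-run timestamp: ${lastRunIso}`);
  }
  // Step back by the overlap buffer so records written near the boundary are not missed.
  return new Date(lastRunMs - overlapBufferSeconds * 1000).toISOString();
}

// Usage with the values from the configuration example above
const bufferedStart = resolveUpdatedAfter('2025-01-22T10:00:00.000Z', 60, '2024-01-01T00:00:00Z');
console.log(bufferedStart); // 2025-01-22T09:59:00.000Z
```

Keeping the calculation in one pure function makes the first-run and invalid-state cases easy to unit test separately from the GraphQL call.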
----
-## Mode 2: Date Range (Use Carefully) ⚠️
-### What It Does
-Extracts all records updated within a specific date window.
-### How It Works
-```typescript
-const startTime = Date.now();
-log.info('📦 Starting date range extraction');
-// User provides specific date range
-const startDate = ctx.activation?.getVariable('startDate'); // "2025-01-01T00:00:00Z"
-const endDate = ctx.activation?.getVariable('endDate'); // "2025-01-31T23:59:59Z"
-log.info('⏱️ Date range configured', { startDate, endDate });
-const result = await client.graphql({
-  query: PRODUCTS_QUERY,
-  variables: {
-    retailerId: 'my-retailer',
-    updatedAfter: startDate,
-    updatedBefore: endDate, // Bounded range
-    first: 200,
-  },
-  pagination: { maxRecords: 50000 },
-});
-log.info('✅ Extraction complete', {
-  recordCount: result.edges.length,
-  duration: Date.now() - startTime
-});
-```
-### When to Use
-- ⚠️ **One-time data audits** (e.g., "show me all orders from Q4 2024")
-- ⚠️ **Backfilling missed extractions** (e.g., extraction failed for 3 days)
-- ⚠️ **Historical reporting** (e.g., "export last month's inventory changes")
-- ⚠️ **Data quality investigations**
-### Risks
-- Can return tens of thousands of records
-- No natural rate limiting
-- Easy to exceed platform limits
-- Can timeout on large datasets
-- Risk of platform overload
-### Safety Guardrails Required
-```typescript
-// ✅ REQUIRED: Validate date range before extraction
-function validateDateRange(startDate: string, endDate: string, log: any) {
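-  const start = new Date(startDate);
-  const end = new Date(endDate);
-  // Reject unparseable dates: a NaN difference would otherwise slip past the maximum check
-  if (Number.isNaN(start.getTime()) || Number.isNaN(end.getTime())) {
-    throw new Error(`Invalid date range: ${startDate} to ${endDate}`);
-  }
-  // Reject inverted or empty ranges: a negative difference would also pass the maximum check
-  if (end.getTime() <= start.getTime()) {
-    throw new Error(`endDate (${endDate}) must be after startDate (${startDate})`);
-  }
-  const daysDiff = (end.getTime() - start.getTime()) / (1000 * 60 * 60 * 24);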
-  log.info('🔍 Validating date range', { startDate, endDate, daysDiff });
-  // Enforce maximum date range
-  const MAX_DAYS = 30;
-  if (daysDiff > MAX_DAYS) {
-    log.error('❌ Date range validation failed', {
-      daysDiff,
-      maxDays: MAX_DAYS,
-      recommendation: 'Split into smaller ranges or use incremental mode'
-    });
-    throw new Error(
-      `Date range too large: ${daysDiff.toFixed(1)} days (max: ${MAX_DAYS}). ` +
-        `Split into smaller ranges or use incremental mode.`
-    );
-  }
-  log.info('✅ Date range validation passed', { daysDiff, maxDays: MAX_DAYS });
-  return { daysDiff, isValid: true };
-}
-```
-### Recommended Limits
-| Entity           | Max Date Range | Max Records | File Splitting    |
-| ---------------- | -------------- | ----------- | ----------------- |
-| **Products**     | 30 days        | 50,000      | Required if > 25k |
-| **Orders**       | 7 days         | 20,000      | Required if > 10k |
-| **Fulfillments** | 7 days         | 20,000      | Required if > 10k |
-| **Inventory**    | 14 days        | 50,000      | Required if > 25k |
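The limits in the table above can be encoded as data so that validation does not rely on a single hard-coded `MAX_DAYS`. This is a rough sketch under stated assumptions: the names `DATE_RANGE_LIMITS`, `checkDateRangeLimits`, and `needsFileSplitting` are hypothetical and are not SDK exports; only the numbers come from the table.

```typescript
type DateRangeEntity = 'products' | 'orders' | 'fulfillments' | 'inventory';

interface EntityLimit {
  maxDays: number;     // "Max Date Range" column
  maxRecords: number;  // "Max Records" column
  splitAbove: number;  // "File Splitting" column threshold
}

// Values copied from the Recommended Limits table
const DATE_RANGE_LIMITS: Record<DateRangeEntity, EntityLimit> = {
  products: { maxDays: 30, maxRecords: 50000, splitAbove: 25000 },
  orders: { maxDays: 7, maxRecords: 20000, splitAbove: 10000 },
  fulfillments: { maxDays: 7, maxRecords: 20000, splitAbove: 10000 },
  inventory: { maxDays: 14, maxRecords: 50000, splitAbove: 25000 },
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: checks a requested window against the entity's limit.
function checkDateRangeLimits(entity: DateRangeEntity, startDate: string, endDate: string) {
  const limit = DATE_RANGE_LIMITS[entity];
  const days = (Date.parse(endDate) - Date.parse(startDate)) / MS_PER_DAY;
  // NaN (unparseable dates) and non-positive ranges are treated as invalid.
  const withinRange = Number.isFinite(days) && days > 0 && days <= limit.maxDays;
  return { days, maxDays: limit.maxDays, withinRange };
}

// Hypothetical helper: true when the result should be written as multiple files.
function needsFileSplitting(entity: DateRangeEntity, recordCount: number): boolean {
  return recordCount > DATE_RANGE_LIMITS[entity].splitAbove;
}

// Usage: a 7-day orders window is allowed; 12,000 orders should be split
const ordersCheck = checkDateRangeLimits('orders', '2025-01-01T00:00:00Z', '2025-01-08T00:00:00Z');
console.log(ordersCheck.withinRange, needsFileSplitting('orders', 12000));
```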
-### Configuration Example
-```json
-{
-  "extractionMode": "dateRange",
-  "startDate": "2025-01-01T00:00:00Z",
-  "endDate": "2025-01-31T00:00:00Z",
-  "pageSize": 200,
-  "maxRecords": 50000,
-  "validateDateRange": true,
-  "maxDaysAllowed": 30
-}
-```
-### Production Restrictions
-- ❌ **Never schedule** date range extractions
-- ❌ **Never use in recurring workflows**
-- ✅ **Always validate date range** before execution
-- ✅ **Always monitor execution time** and record counts
-- ✅ **Always implement file splitting** for large results
-- ✅ **Always get approval** before running on production data
----
-## File Splitting for Large Extractions
-### Overview
-When ExtractionOrchestrator fetches large datasets (e.g., 50,000 records), all records are loaded into memory. Before writing to destination (SFTP/S3), you can split these records into multiple smaller files for better manageability, parallel processing, and downstream system compatibility.
-### When to Split Files
-- ✅ **Extraction returns >10k records** - Split into manageable chunks
-- ✅ **Downstream system has file size limits** - Respect partner constraints
-- ✅ **Parallel processing desired** - Write multiple files concurrently
-- ✅ **Network reliability concerns** - Smaller files = better retry granularity
-- ✅ **Audit trail requirements** - Track processing per file
-### Basic File Splitting Pattern
-```typescript
-import { Buffer } from 'node:buffer';
-import { ExtractionOrchestrator, SftpDataSource } from '@fluentcommerce/fc-connect-sdk';
-const startTime = Date.now();
-log.info('📦 Starting extraction with file splitting');
-// STEP 1: Extract all records (loads into memory)
-const result = await orchestrator.extract({
-  query: virtualPositionsQuery,
-  variables: { retailerId: 'my-retailer', updatedAfter: lastRunTime },
-  pagination: { pageSize: 200, maxRecords: 50000 },
-});
-log.info('✅ Extraction complete', {
-  recordCount: result.data.length,
-  duration: Date.now() - startTime
-});
-// result.data contains all 50,000 records in memory
-// STEP 2: Configure chunking
-const RECORDS_PER_FILE = 1000; // Configurable chunk size
-const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
-// STEP 3: Split records into chunks
-const chunks: any[][] = [];
-for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
-  chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
-}
-log.info('📂 File splitting configured', {
-  totalRecords: result.data.length,
-  recordsPerFile: RECORDS_PER_FILE,
-  filesToCreate: chunks.length,
-});
-// STEP 4: Generate files with configurable naming pattern
-const filePromises = chunks.map((chunk, index) => {
-  // File naming pattern: entity-timestamp-part-NNN.format
-  const partNumber = String(index + 1).padStart(3, '0'); // 001, 002, 003...
-  const filename = `virtual-positions-${timestamp}-part-${partNumber}.xml`;
-  // Transform chunk to desired format (XML, CSV, JSON)
-  const xmlContent = xmlBuilder.build({ records: chunk });
-  // Return upload promise
-  return sftp.uploadFile(
-    `/outbound/${filename}`,
-    Buffer.from(xmlContent, 'utf8'),
-    { encoding: 'utf8', overwrite: false }
-  );
-});
-// STEP 5: Write all files in parallel
-const uploadStartTime = Date.now();
-await Promise.all(filePromises);
-log.info('✅ File splitting complete', {
-  filesCreated: chunks.length,
-  totalRecords: result.data.length,
-  recordsPerFile: RECORDS_PER_FILE,
-  uploadDuration: Date.now() - uploadStartTime,
-  totalDuration: Date.now() - startTime
-});
-```
-### Configurable File Naming Patterns
-Support multiple naming patterns via configuration:
-```typescript
-// Configuration interface
-interface FileSplittingConfig {
-  enabled: boolean;
-  recordsPerFile: number;
-  namingPattern: 'sequential' | 'timestamp' | 'range' | 'custom';
-  customPattern?: (index: number, chunk: any[], total: number) => string;
-}
-// Pattern examples
-const patterns = {
-  // Pattern 1: Sequential numbering
-  sequential: (index: number) =>
-    `virtual-positions-part-${String(index + 1).padStart(3, '0')}.xml`,
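-  // Pattern 2: Timestamp-based (colons and dots replaced so the name is filesystem-safe)
-  timestamp: (index: number) =>
-    `virtual-positions-${new Date().toISOString().replace(/[:.]/g, '-')}-${index + 1}.xml`,
-  // Pattern 3: Record range
-  range: (index: number, chunk: any[]) => {
-    // Offset by the configured chunk size (RECORDS_PER_FILE from the basic pattern),
-    // not chunk.length: the last chunk may be shorter than the others
-    const start = index * RECORDS_PER_FILE + 1;
-    const end = start + chunk.length - 1;
-    return `virtual-positions-records-${start}-${end}.xml`;
-  },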
-  // Pattern 4: Custom with metadata
-  custom: (index: number, chunk: any[], totalChunks: number) =>
-    `VP_${new Date().toISOString().split('T')[0]}_${index + 1}_of_${totalChunks}.xml`,
-};
-```
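As a worked example of the record-range pattern, the sketch below chunks 2,500 records at 1,000 per file and derives the range from the configured chunk size, so the final, shorter chunk is still numbered correctly. `chunkRecords` and `rangeFileName` are hypothetical helper names used only for illustration.

```typescript
// Hypothetical helper: split an array into fixed-size chunks (the last one may be shorter).
function chunkRecords<T>(records: readonly T[], recordsPerFile: number): T[][] {
  if (!Number.isInteger(recordsPerFile) || recordsPerFile < 1) {
    throw new Error(`recordsPerFile must be a positive integer, got ${recordsPerFile}`);
  }
  const chunks: T[][] = [];
  for (let i = 0; i < records.length; i += recordsPerFile) {
    chunks.push(records.slice(i, i + recordsPerFile));
  }
  return chunks;
}

// Hypothetical helper: 1-based inclusive record range for a chunk.
function rangeFileName(
  prefix: string,
  chunkIndex: number,
  chunkLength: number,
  recordsPerFile: number
): string {
  const start = chunkIndex * recordsPerFile + 1; // based on the configured size
  const end = start + chunkLength - 1;           // based on the actual chunk length
  return `${prefix}-records-${start}-${end}.xml`;
}

// Usage: 2,500 placeholder records, 1,000 per file
const sampleRecords = Array.from({ length: 2500 }, (_, i) => ({ ref: `VP-${i + 1}` }));
const sampleChunks = chunkRecords(sampleRecords, 1000);
const sampleNames = sampleChunks.map((chunk, index) =>
  rangeFileName('virtual-positions', index, chunk.length, 1000)
);
console.log(sampleNames);
// [ 'virtual-positions-records-1-1000.xml',
//   'virtual-positions-records-1001-2000.xml',
//   'virtual-positions-records-2001-2500.xml' ]
```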
-### Complete Example: Extraction with File Splitting
-```typescript
-import { schedule, fn, MemoryInterpreter } from '@versori/run';
-import { Buffer } from 'node:buffer';
-import {
-  ExtractionOrchestrator,
-  createClient,
-  SftpDataSource,
-  VersoriKVAdapter,
-  XMLBuilder,
-} from '@fluentcommerce/fc-connect-sdk';
-export const virtualPositionsExtraction = schedule(
-  'virtual-positions-hourly',
-  '0 * * * *'
-).then(
-  fn('extract', async (ctx) => {
-    const { log, activation, openKv } = ctx;
-    const startTime = Date.now();
-    log.info('📦 Starting virtual positions extraction');
-    // Configuration from activation variables
-    const FILE_SPLITTING_ENABLED = activation.getVariable('fileSplittingEnabled') !== 'false';
-    const RECORDS_PER_FILE = parseInt(activation.getVariable('recordsPerFile') || '1000', 10);
-    const FILE_NAMING_PATTERN = activation.getVariable('fileNamingPattern') || 'sequential';
-    // Declared before try so the finally block can dispose it;
-    // it stays undefined at runtime if an error occurs before construction
-    let sftp!: SftpDataSource;
-    try {
-      // Initialize services
-      const client = await createClient(ctx);
-      log.info('✅ Client initialized');
-      const kv = new VersoriKVAdapter(openKv(':project:'));
-      const orchestrator = new ExtractionOrchestrator(client, log);
-      // Extract records (all loaded into memory)
-      const extractionStartTime = Date.now();
-      const result = await orchestrator.extract({
-        query: virtualPositionsQuery,
-        variables: {
-          retailerId: activation.getVariable('fluentRetailerId'),
-          updatedAfter: await kv.get(['extraction', 'lastRunTime']) || '2025-01-01T00:00:00Z',
-        },
-        pagination: { pageSize: 200, maxRecords: 50000 },
-      });
-      log.info('✅ Extraction complete', {
-        recordCount: result.data.length,
-        fileSplittingEnabled: FILE_SPLITTING_ENABLED,
-        extractionDuration: Date.now() - extractionStartTime
-      });
-      // Initialize SFTP
-      sftp = new SftpDataSource({
-        type: 'SFTP_XML',
-        connectionId: 'sftp-extractions',
-        name: 'extraction-sftp',
-        settings: {
-          host: activation.getVariable('sftpHost'),
-          port: parseInt(activation.getVariable('sftpPort') || '22', 10),
-          username: activation.getVariable('sftpUsername'),
-          password: activation.getVariable('sftpPassword'),
-          remotePath: '/outbound/',
-          requireAbsolutePaths: true,
-        },
-      }, log);
-      await sftp.validateConnection();
-      log.info('✅ SFTP connection validated');
-      // File splitting logic
-      if (FILE_SPLITTING_ENABLED && result.data.length > RECORDS_PER_FILE) {
-        const splittingStartTime = Date.now();
-        // Split into chunks
-        const chunks: any[][] = [];
-        for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
-          chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
-        }
-        log.info('📂 Splitting extraction into multiple files', {
-          totalRecords: result.data.length,
-          recordsPerFile: RECORDS_PER_FILE,
-          filesToCreate: chunks.length,
-        });
-        // Generate files
-        const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
-        const xmlBuilder = new XMLBuilder();
-        const filePromises = chunks.map(async (chunk, index) => {
-          // Generate filename based on pattern
-          let filename: string;
-          const partNumber = String(index + 1).padStart(3, '0');
-          switch (FILE_NAMING_PATTERN) {
-            case 'timestamp':
-              filename = `virtual-positions-${timestamp}-${partNumber}.xml`;
-              break;
-            case 'range': {
-              // Braces scope the const declarations to this case
-              const start = index * RECORDS_PER_FILE + 1;
-              const end = Math.min(start + chunk.length - 1, result.data.length);
-              filename = `virtual-positions-records-${start}-${end}.xml`;
-              break;
-            }
-            case 'sequential':
-            default:
-              filename = `virtual-positions-part-${partNumber}.xml`;
-              break;
-          }
-          // Build XML content
-          const xmlContent = xmlBuilder.build({ virtualPositions: chunk });
-          // Upload to SFTP
-          await sftp.uploadFile(
-            `/outbound/${filename}`,
-            Buffer.from(xmlContent, 'utf8'),
-            { encoding: 'utf8', overwrite: false }
-          );
-          return { filename, recordCount: chunk.length };
-        });
-        // Write all files in parallel
-        const uploadedFiles = await Promise.all(filePromises);
-        log.info('✅ File splitting complete', {
-          filesCreated: uploadedFiles.length,
-          totalRecords: result.data.length,
-          files: uploadedFiles,
-          splittingDuration: Date.now() - splittingStartTime,
-          totalDuration: Date.now() - startTime
-        });
-        return {
-          success: true,
-          extractionMode: 'incremental',
-          totalRecords: result.data.length,
-          filesCreated: uploadedFiles.length,
-          files: uploadedFiles,
-          duration: Date.now() - startTime
-        };
-      } else {
-        // Single file (no splitting)
-        const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
-        const filename = `virtual-positions-${timestamp}.xml`;
-        const xmlBuilder = new XMLBuilder();
-        const xmlContent = xmlBuilder.build({ virtualPositions: result.data });
-        await sftp.uploadFile(
-          `/outbound/${filename}`,
-          Buffer.from(xmlContent, 'utf8'),
-          { encoding: 'utf8', overwrite: false }
-        );
-        log.info('✅ Single file extraction complete', {
-          filename,
-          recordCount: result.data.length,
-          duration: Date.now() - startTime
-        });
-        return {
-          success: true,
-          extractionMode: 'incremental',
-          totalRecords: result.data.length,
-          filesCreated: 1,
-          files: [{ filename, recordCount: result.data.length }],
-          duration: Date.now() - startTime
-        };
-      }
-    } catch (error: any) {
-      log.error('❌ Extraction failed', {
-        message: error instanceof Error ? error.message : String(error),
-        stack: error instanceof Error ? error.stack : undefined,
-        errorType: error instanceof Error ? error.constructor.name : 'Error',
-        duration: Date.now() - startTime
-      });
-      return {
-        success: false,
-        error: error instanceof Error ? error.message : String(error),
-        duration: Date.now() - startTime
-      };
-    } finally {
-      if (sftp) {
-        await sftp.dispose();
-        log.info('✅ SFTP connection disposed');
-      }
-    }
-  })
-);
-// Export with MemoryInterpreter for Versori platform
-export const interpreter = new MemoryInterpreter({
-  workflows: [virtualPositionsExtraction]
-});
-```
-### Date Range Splitting for Very Large Datasets
-For extractions spanning multiple years (e.g., historical dumps), split the date range and run multiple smaller extractions:
-```typescript
-// Split large date range into manageable chunks
-async function extractHistoricalData(
-  startDate: string,  // "2020-01-01"
-  endDate: string,    // "2025-01-01"
-  chunkMonths: number = 1,  // Extract 1 month at a time
-  log: any
-) {
-  const overallStartTime = Date.now();
-  const results = [];
-  let currentStart = new Date(startDate);
-  const finalEnd = new Date(endDate);
-  log.info('📦 Starting historical data extraction', {
-    startDate,
-    endDate,
-    chunkMonths,
-    estimatedChunks: Math.ceil((finalEnd.getTime() - currentStart.getTime()) / (chunkMonths * 30 * 24 * 60 * 60 * 1000))
-  });
-  while (currentStart < finalEnd) {
-    const chunkStartTime = Date.now();
-    // Calculate chunk end date
-    const currentEnd = new Date(currentStart);
-    currentEnd.setMonth(currentEnd.getMonth() + chunkMonths);
-    // Ensure we don't exceed final end date
-    if (currentEnd > finalEnd) {
-      currentEnd.setTime(finalEnd.getTime());
-    }
-    log.info('📂 Extracting date range chunk', {
-      chunkNumber: results.length + 1,
-      startDate: currentStart.toISOString(),
-      endDate: currentEnd.toISOString(),
-    });
-    // Extract for this date range
-    const result = await orchestrator.extract({
-      query: virtualPositionsQuery,
-      variables: {
-        retailerId: 'my-retailer',
-        updatedAfter: currentStart.toISOString(),
-        updatedBefore: currentEnd.toISOString(),
-      },
-      pagination: { pageSize: 200, maxRecords: 50000 },
-    });
-    log.info('✅ Chunk extraction complete', {
-      chunkNumber: results.length + 1,
-      recordCount: result.data.length,
-      chunkDuration: Date.now() - chunkStartTime
-    });
-    // Split this chunk into files if needed
-    if (result.data.length > RECORDS_PER_FILE) {
-      const chunks: any[][] = [];
-      for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
-        chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
-      }
-      // Write chunks in parallel
-      const timestamp = currentStart.toISOString().split('T')[0];
-      await Promise.all(
-        chunks.map((chunk, index) => {
-          const filename = `historical-${timestamp}-part-${String(index + 1).padStart(3, '0')}.xml`;
-          return sftp.uploadFile(`/outbound/${filename}`, Buffer.from(xmlBuilder.build(chunk), 'utf8'));
-        })
-      );
-    }
-    results.push({
-      startDate: currentStart.toISOString(),
-      endDate: currentEnd.toISOString(),
-      recordCount: result.data.length,
-      duration: Date.now() - chunkStartTime
-    });
-    // Move to next chunk
-    currentStart = new Date(currentEnd);
-    // Rate limiting - wait between chunks (no delay needed after the final chunk)
-    if (currentStart < finalEnd) {
-      log.info('⏱️ Rate limiting delay (5 seconds)');
-      await new Promise(resolve => setTimeout(resolve, 5000)); // 5 second delay
-    }
-  }
-  const totalRecords = results.reduce((sum, r) => sum + r.recordCount, 0);
-  const totalDuration = Date.now() - overallStartTime;
-  log.info('✅ Historical extraction complete', {
-    totalChunks: results.length,
-    totalRecords,
-    totalDuration,
-    averageRecordsPerChunk: results.length > 0 ? Math.round(totalRecords / results.length) : 0
-  });
-  return results;
-}
-// Usage: Extract 5 years of data, 1 month at a time
-const historicalResults = await extractHistoricalData('2020-01-01', '2025-01-01', 1, log);
-```
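The window arithmetic in the loop above can be isolated into a pure function, which makes it possible to check the chunk boundaries before any extraction runs. The sketch below is an illustration only: `splitIntoMonthlyWindows` is a hypothetical name, it works in UTC, and it assumes the start date falls on a day that exists in every month (1 to 28), since month arithmetic on days 29 to 31 rolls over into the following month.

```typescript
interface DateWindow {
  start: string; // inclusive, ISO 8601
  end: string;   // exclusive, ISO 8601
}

// Hypothetical helper: split [startDate, endDate) into windows of chunkMonths months (UTC).
function splitIntoMonthlyWindows(
  startDate: string,
  endDate: string,
  chunkMonths: number = 1
): DateWindow[] {
  const startMs = Date.parse(startDate);
  const finalEndMs = Date.parse(endDate);
  if (Number.isNaN(startMs) || Number.isNaN(finalEndMs)) {
    throw new Error(`Invalid date range: ${startDate} to ${endDate}`);
  }
  if (!Number.isInteger(chunkMonths) || chunkMonths < 1) {
    throw new Error(`chunkMonths must be a positive integer, got ${chunkMonths}`);
  }

  const windows: DateWindow[] = [];
  let cursorMs = startMs;
  while (cursorMs < finalEndMs) {
    // Advance by whole calendar months in UTC to avoid local-time and DST shifts.
    const next = new Date(cursorMs);
    next.setUTCMonth(next.getUTCMonth() + chunkMonths);
    // Clamp the last window to the requested end date.
    const endMs = Math.min(next.getTime(), finalEndMs);
    windows.push({
      start: new Date(cursorMs).toISOString(),
      end: new Date(endMs).toISOString(),
    });
    cursorMs = endMs;
  }
  return windows;
}

// Usage: five years, one month at a time
const plannedWindows = splitIntoMonthlyWindows('2020-01-01', '2025-01-01', 1);
console.log(plannedWindows.length); // 60
```

Planning the windows up front also gives an exact chunk count for logging, instead of the 30-day approximation used for `estimatedChunks`.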
-### Configuration Variables for File Splitting
-Add these to Versori activation variables:
-```bash
-# File Splitting Configuration
-FILE_SPLITTING_ENABLED=true              # Enable/disable file splitting
-RECORDS_PER_FILE=1000                    # Records per file (default: 1000)
-FILE_NAMING_PATTERN=sequential           # Naming pattern: sequential | timestamp | range
-MAX_RECORDS_PER_EXTRACTION=50000         # Maximum records to extract in one run
-# Date Range Splitting (for large historical extractions)
-DATE_RANGE_CHUNK_MONTHS=1                # Months per extraction chunk (default: 1)
-RATE_LIMIT_DELAY_MS=5000                 # Delay between chunks (milliseconds)
-```
-### Benefits of File Splitting
-- ✅ **Smaller payloads per file** - Each file is built and uploaded from one chunk; the full result set is still held in memory (see Limitations)
-- ✅ **Parallel writes** - Multiple files written concurrently (faster)
-- ✅ **Better error recovery** - Retry individual files vs entire extraction
-- ✅ **Downstream compatibility** - Honor partner file size limits
-- ✅ **Audit granularity** - Track processing per file
-- ✅ **Network resilience** - Smaller files = better upload success rate
-### Limitations
-- All records must fit in memory during extraction (ExtractionOrchestrator loads all pages)
-- File splitting happens post-extraction (not during pagination)
-- Parallel writes limited by available memory and network bandwidth
-- SFTP connection pool size may limit concurrency (default: 10 connections)
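Because `Promise.all()` starts every upload at once, a large number of chunks can exceed the connection pool mentioned above. One way to cap concurrency is a small worker-pool helper. This is a minimal sketch: `mapWithConcurrency` is a hypothetical utility, not an SDK function, and the upload here is a stand-in so the example runs without an SFTP server.

```typescript
// Hypothetical helper: run worker over items with at most `limit` calls in flight.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error(`limit must be a positive integer, got ${limit}`);
  }
  const results: R[] = new Array(items.length);
  let nextIndex = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function runner(): Promise<void> {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as the input
}

// Stand-in for an upload call such as sftp.uploadFile(...)
async function fakeUpload(filename: string): Promise<string> {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `uploaded:${filename}`;
}

// Usage: upload 25 files with at most 5 in flight
const pendingNames = Array.from({ length: 25 }, (_, i) => `part-${String(i + 1).padStart(3, '0')}.xml`);
mapWithConcurrency(pendingNames, 5, (name) => fakeUpload(name)).then((uploaded) => {
  console.log(`${uploaded.length} files uploaded`);
});
```

If any worker rejects, the returned promise rejects and the other runners stop after finishing their current item, so callers that need per-file retry should catch errors inside the worker.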
----
-## Historical Data Extraction via Date Range
-### Overview
-There is **no separate "historical mode"** in the SDK. To extract historical data (e.g., all records from 2020 onwards), use **Date Range mode** with a very old start date and appropriate safeguards.
-### How to Extract Historical Data
-```typescript
-// Historical extraction is just Date Range with old start date
-await client.graphql({
-  query: PRODUCTS_QUERY,
-  variables: {
-    retailerId: 'my-retailer',
-    updatedAfter: '1970-01-01T00:00:00Z',  // Very old date = "all records"
-    updatedBefore: new Date().toISOString(), // Up to now
-    first: 200,
-  },
-  pagination: { maxRecords: 50000 },
-});
-```
-### Required Safeguards for Historical Extraction
-1. **Date Range Splitting** - Split into smaller chunks (monthly/quarterly)
-2. **File Splitting** - Split large results into multiple files
-3. **Rate Limiting** - Add delays between chunks
-4. **Validation** - Verify date ranges before execution
-5. **Monitoring** - Track progress and alert on anomalies
-6. **Approval** - Get sign-off before running on production
-### Recommended Approach: Chunked Date Range Extraction
-```bash
-#!/bin/bash
-# Safe historical extraction via chunked date ranges
-START_DATE="2020-01-01"
-END_DATE="2025-01-01"
-# Extract one month at a time
-current=$START_DATE
-while [[ "$current" < "$END_DATE" ]]; do
-  # Calculate month end (requires GNU date; BSD/macOS date does not accept relative dates with -d)
-  monthEnd=$(date -d "$current + 1 month" +%Y-%m-%d)
-  echo "Extracting $current to $monthEnd..."
-  # Trigger date range extraction
-  curl -X POST https://versori-webhook.com/extract \
-    -H "Content-Type: application/json" \
-    -d "{
-      \"extractionMode\": \"dateRange\",
-      \"startDate\": \"${current}T00:00:00Z\",
-      \"endDate\": \"${monthEnd}T00:00:00Z\",
-      \"fileSplittingEnabled\": true,
-      \"recordsPerFile\": 1000
-    }"
-  # Rate limiting - wait 60 seconds between chunks
-  sleep 60
-  # Move to next month
-  current=$monthEnd
-done
-echo "Historical extraction complete"
-```
-### Migration to Incremental After Historical Load
-After completing a one-time historical extraction, switch to incremental mode for ongoing syncs:
-```json
-{
-  "extractionMode": "incremental",
-  "fallbackStartDate": "2025-01-22T00:00:00Z",
-  "pageSize": 200,
-  "maxRecords": 10000
-}
-```
----
-## Decision Tree: Which Mode to Use?
-```
-Start Here
-    │
-    ├─ Need recurring extractions? ─────────► Use INCREMENTAL ✅
-    │                                          (hourly/daily/every 15 min)
-    │                                          Tracks state, auto-recovery
-    │
-    └─ Need specific date range? ────────────► Use DATE RANGE ⚠️
-         │                                      (one-time, validate range)
-         │
-         ├─ Date range ≤ 30 days? ──────────► Single DATE RANGE run
-         │                                     + file splitting if >10k records
-         │
-         └─ Date range > 30 days? ──────────► Split into monthly chunks
-              │                               + file splitting per chunk
-              │                               + rate limiting between chunks
-              │
-              └─ Historical (all data)? ────► DATE RANGE with old start date
-                                              + chunked approach (monthly)
-                                              + approval required
-```
----
-## Monitoring & Alerts
-Set up alerts for extraction volumes:
-```typescript
-// In extraction workflow
-const recordCount = edges.length;
-const ALERT_THRESHOLD = 50000;
-log.info('📊 Checking extraction volume', { recordCount, threshold: ALERT_THRESHOLD });
-if (recordCount > ALERT_THRESHOLD) {
-  log.error('❌ Extraction volume exceeded threshold', {
-    recordCount,
-    threshold: ALERT_THRESHOLD,
-    percentageOver: Math.round(((recordCount - ALERT_THRESHOLD) / ALERT_THRESHOLD) * 100),
-    mode: extractionMode,
-    recommendation: 'Switch to incremental mode or reduce date range',
-  });
-  // Send alert to monitoring system
-  await sendAlert({
-    severity: 'high',
-    message: `Extraction returned ${recordCount} records (threshold: ${ALERT_THRESHOLD})`,
-    mode: extractionMode,
-  });
-  log.info('✅ Alert sent to monitoring system');
-} else {
-  log.info('✅ Extraction volume within acceptable limits', {
-    recordCount,
-    threshold: ALERT_THRESHOLD,
-    percentageUsed: Math.round((recordCount / ALERT_THRESHOLD) * 100)
-  });
-}
-```
----
-## Summary & Best Practices
-### ✅ DO
-- Use **incremental mode** for all scheduled extractions
-- Validate date ranges before running **dateRange mode**
-- Implement **file splitting** for large extractions (>10k records)
-- Use **parallel writes** with `Promise.all()` for split files
-- Monitor extraction volumes and set alerts
-- Test on staging before production
-- Use overlap buffer (60s) to prevent gaps in incremental mode
-- Track state with VersoriKV
-- Split large historical extractions into monthly chunks
-- Add rate limiting between extraction chunks
-### ❌ DON'T
-- Schedule **dateRange** extractions (use incremental mode for anything scheduled)
-- Use date ranges > 30 days without chunking
-- Skip validation checks for date ranges
-- Ignore volume alerts (>50k records)
-- Forget to implement file splitting for large results
-- Run historical extractions without approval and monitoring
----
-## ExtractionOrchestrator (SDK v0.1.27+)
-The SDK includes **ExtractionOrchestrator** - a high-level service that simplifies extraction workflows with built-in mode handling, pagination, and output management.
-### Why Use ExtractionOrchestrator?
-**Instead of manually implementing:**
-- Mode detection (incremental/dateRange)
-- Pagination loops
-- Path-based field extraction
-- Output formatting (CSV/JSON/Parquet)
-- S3/SFTP uploads
-- Error handling
-**ExtractionOrchestrator handles it all:**
-```typescript
-import { ExtractionOrchestrator, createClient } from '@fluentcommerce/fc-connect-sdk';
-const startTime = Date.now();
-log.info('📦 Initializing ExtractionOrchestrator');
-const client = await createClient(ctx);
-const orchestrator = new ExtractionOrchestrator(client, log);
-log.info('✅ Orchestrator initialized');
-// Both modes supported with single interface
-const result = await orchestrator.extract({
-  query: virtualPositionsQuery,
-  variables: { retailerId: 'my-retailer' },
-  // Mode: 'incremental' (scheduled) or 'dateRange' (ad-hoc)
-  extractionMode: 'incremental',
-  stateKey: 'virtual-positions-extraction',
-  // Pagination handled automatically
-  pagination: {
-    pageSize: 200,
-    maxRecords: 10000,
-  },
-  // Output format and destination
-  outputFormat: 'csv',
-  outputDestination: {
-    type: 's3',
-    bucket: 'my-extracts',
-    key: 'virtual-positions/hourly/{{timestamp}}.csv',
-  },
-  // Field extraction from nested paths
-  fieldPaths: {
-    position_ref: 'ref',
-    quantity: 'quantity',
-    location_ref: 'locationLink.ref',
-    location_name: 'locationLink.name',
-  },
-});
-log.info('✅ Extraction and upload complete', {
-  recordCount: result.recordCount,
-  outputFile: result.outputFile,
-  duration: Date.now() - startTime
-});
-```
-### Features
-- **Auto-pagination**: Handles cursor-based pagination automatically
-- **Mode support**: Both modes (incremental, dateRange)
-- **State management**: Tracks last run timestamps for incremental mode
-- **Path extraction**: Extracts nested fields from GraphQL responses
-- **Multi-format**: Outputs CSV, JSON, or Parquet
-- **Validation**: Built-in query and response validation
-- **Error recovery**: Graceful failure handling with detailed logs
-- **File splitting**: Not built in; add post-extraction chunking manually for large datasets (see File Splitting section above)
-### Example: Incremental Extraction with ExtractionOrchestrator
-```typescript
-import { schedule, fn, MemoryInterpreter } from '@versori/run';
-import { Buffer } from 'node:buffer';  // Required for Deno/Versori runtime
-import {
-  ExtractionOrchestrator,
-  createClient,
-  VersoriKVAdapter,
-} from '@fluentcommerce/fc-connect-sdk';
-export const hourlyExtraction = schedule('hourly-virtual-positions', '0 * * * *').then(
-  fn('extract', async (ctx) => {
-    const { log, openKv, env } = ctx;
-    const startTime = Date.now();
-    log.info('📦 Starting hourly virtual positions extraction');
-    const client = await createClient(ctx);
-    log.info('✅ Client initialized');
-    const kv = new VersoriKVAdapter(openKv(':project:'));
-    const orchestrator = new ExtractionOrchestrator(client, log);
-    const result = await orchestrator.extract({
-      query: `
-        query GetVirtualPositions($retailerId: String!, $updatedAfter: String, $first: Int, $after: String) {
-          virtualPositions(retailerId: $retailerId, updatedAfter: $updatedAfter, first: $first, after: $after) {
-            edges {
-              node {
-                ref
-                quantity
-                productRef
-                locationLink { ref name }
-                updatedOn
-              }
-              cursor
-            }
-            pageInfo {
-              hasNextPage
-              # Note: Fluent doesn't return endCursor - cursors are in edges[].cursor
-            }
-          }
-        }
-      `,
-      variables: { retailerId: env.FLUENT_RETAILER_ID },
-      extractionMode: 'incremental',
-      stateAdapter: kv,
-      stateKey: 'hourly-extraction',
-      fallbackStartDate: '2025-01-01T00:00:00Z',
-      pagination: { pageSize: 200, maxRecords: 10000 },
-      outputFormat: 'csv',
-      outputDestination: {
-        type: 's3',
-        bucket: env.S3_BUCKET,
-        key: `virtual-positions/{{date}}/{{timestamp}}.csv`,
-        config: {
-          accessKeyId: env.AWS_ACCESS_KEY_ID,
-          secretAccessKey: env.AWS_SECRET_ACCESS_KEY,
-          region: env.AWS_REGION,
-        },
-      },
-    });
-    log.info('✅ Extraction complete', {
-      success: result.success,
-      recordCount: result.recordCount,
-      outputFile: result.outputFile,
-      duration: Date.now() - startTime
-    });
-    return {
-      success: result.success,
-      recordCount: result.recordCount,
-      outputFile: result.outputFile,
-      duration: Date.now() - startTime
-    };
-  })
-);
-// Export with MemoryInterpreter for Versori platform
-export const interpreter = new MemoryInterpreter({
-  workflows: [hourlyExtraction]
-});
-```
-### When to Use ExtractionOrchestrator
-- **✅ Use for**: New extraction workflows, scheduled extractions, standard use cases
-- **⚠️ Manual approach**: Complex transformations, custom business logic, non-standard outputs
-See [ExtractionOrchestrator API Reference](../../../../02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-08-extraction-orchestrator.md) for complete documentation.
----
-## See Also
-- [CLI Validation Workflow](../../../../02-CORE-GUIDES/api-reference/modules/api-reference-11-cli-tools.md) - Validate queries and mappings
-- [Production Safety Guide](../../../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-09-best-practices.md) - General safety practices
-- [GraphQL Query Examples](./graphql-queries/) - Sample extraction queries
-- [Universal Mapping Guide](../../../../02-CORE-GUIDES/advanced-services/advanced-services-readme.md) - Field mapping documentation
-- [ExtractionOrchestrator Examples](../../../../03-PATTERN-GUIDES/examples/test-data/03-PATTERN-GUIDES-readme.md) - Complete working examples
+# Extraction Modes Guide
+**FC Connect SDK - Choosing the Right Extraction Mode**
+> This guide explains the two extraction modes available in the SDK and when to use each one.
+---
+## Overview
+The FC Connect SDK supports two extraction modes for GraphQL queries:
+1. **Incremental** - Extract only records changed since the last run (recommended for scheduled workflows)
+2. **Date Range** - Extract records within a specific date window (for ad-hoc queries, backfills, or historical data)
+**Note:** "Historical extraction" is simply Date Range mode with a very old start date (e.g., `updatedAfter: '1970-01-01'`). There is no separate "historical mode" - use Date Range with appropriate validation and safeguards.
+---
+## Quick Comparison Table
+| Mode            | Use Case                                      | Safety Level | Recommended Frequency | Typical Records | Production Risk                     |
+| --------------- | --------------------------------------------- | ------------ | --------------------- | --------------- | ----------------------------------- |
+| **Incremental** | Scheduled syncs, recurring extractions        | ✅ **Safe**  | Hourly/Daily          | ~10k            | ✅ **Low** - Safe for production    |
+| **Date Range**  | Ad-hoc audits, backfills, historical dumps    | ⚠️ **Risky** | One-time only         | ~50k            | ⚠️ **Medium** - Requires validation |
+---
+## Mode 1: Incremental (Recommended) ✅
+### What It Does
+Extracts only records updated since the last successful extraction run.
+### How It Works
+```typescript
+const startTime = Date.now();
+log.info('📦 Starting incremental extraction');
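+// Load last run state (saved below as { timestamp: <ISO string> })
+// NOTE: assumes kv.get returns the stored object as-is; adjust if your adapter wraps the value
+const lastRun = await kv.get(['extraction', 'products', 'lastRunTime']);
+const lastRunTime = lastRun?.timestamp ?? '2024-01-01T00:00:00Z'; // fallbackStartDate on first run
+log.info('⏱️ Last run timestamp loaded', { lastRunTime });
+// Query with buffered timestamp (overlap buffer prevents gaps)
+const bufferedTime = new Date(new Date(lastRunTime).getTime() - 60000).toISOString(); // 60 second buffer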
+const result = await client.graphql({
+  query: PRODUCTS_QUERY,
+  variables: {
+    retailerId: 'my-retailer',
+    updatedAfter: bufferedTime, // Only changed records
+    first: 200,
+  },
+  pagination: { maxRecords: 10000 },
+});
+log.info('✅ Extraction complete', {
+  recordCount: result.edges.length,
+  duration: Date.now() - startTime
+});
+// Save new timestamp after successful extraction
+await kv.set(['extraction', 'products', 'lastRunTime'], {
+  timestamp: new Date().toISOString(),
+});
+```
+### When to Use
+- ✅ **Scheduled extractions** (hourly, daily, every 15 minutes)
+- ✅ **Real-time inventory feeds**
+- ✅ **Order status updates**
+- ✅ **Product catalog syncs**
+- ✅ **Any recurring extraction workflow**
+### Safety Features
+- Natural rate limiting via timestamps
+- Overlap buffer prevents missed records
+- State tracking prevents reprocessing
+- Predictable record counts
+- Automatic recovery after failures
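Because the overlap buffer deliberately re-reads the last 60 seconds before the previous run, a record updated in that window can be fetched twice. The following is a minimal sketch of de-duplicating by `ref`, assuming each record carries `ref` and `updatedOn` as in the virtual positions query; `dedupeByRef` and `ExtractedRecord` are illustrative names, not part of the SDK.

```typescript
// Hypothetical record shape: `ref` and `updatedOn` mirror fields selected in
// the GraphQL query; adjust to the entity being extracted.
interface ExtractedRecord {
  ref: string;
  updatedOn: string; // ISO-8601 timestamp
}

// Keep one record per `ref`, preferring the most recently updated copy.
function dedupeByRef<T extends ExtractedRecord>(records: T[]): T[] {
  const latest = new Map<string, T>();
  for (const record of records) {
    const existing = latest.get(record.ref);
    if (!existing || Date.parse(record.updatedOn) >= Date.parse(existing.updatedOn)) {
      latest.set(record.ref, record);
    }
  }
  return Array.from(latest.values());
}

const merged = dedupeByRef([
  { ref: 'VP-1', updatedOn: '2025-01-01T10:00:00Z' },
  { ref: 'VP-2', updatedOn: '2025-01-01T10:00:30Z' },
  { ref: 'VP-1', updatedOn: '2025-01-01T10:00:45Z' }, // re-fetched via the overlap window
]);
```

Whether de-duplication belongs in the extraction workflow or in the downstream consumer depends on the integration; the sketch only shows the mechanics.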
+### Recommended Settings
+| Entity                   | Frequency    | Max Records | Expected Volume    |
+| ------------------------ | ------------ | ----------- | ------------------ |
+| **Products**             | Daily        | 10,000      | 5k-10k updates/day |
+| **Orders**               | Hourly       | 5,000       | 100-500/hour       |
+| **Fulfillments**         | Hourly       | 5,000       | 100-500/hour       |
+| **Inventory Quantities** | Every 15 min | 10,000      | 500-2k per run     |
+| **Inventory Positions**  | Daily        | 20,000      | 10k-20k per day    |
+| **Virtual Positions**    | Hourly       | 10,000      | 1k-5k per hour     |
+### Configuration Example
+```json
+{
+  "extractionMode": "incremental",
+  "pageSize": 200,
+  "maxRecords": 10000,
+  "overlapBufferSeconds": 60,
+  "fallbackStartDate": "2024-01-01T00:00:00Z"
+}
+```
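As a rough illustration of how `overlapBufferSeconds` and `fallbackStartDate` from the configuration above could drive the `updatedAfter` variable, here is a small pure helper. The function name and the `IncrementalConfig` interface are assumptions made for this sketch; the SDK may resolve these values differently.

```typescript
// Mirrors two fields of the JSON configuration above (illustrative type only).
interface IncrementalConfig {
  overlapBufferSeconds: number;
  fallbackStartDate: string; // ISO-8601
}

// Returns the `updatedAfter` value for the next run:
// - no stored timestamp (first run): the fallback start date
// - otherwise: the last run time minus the overlap buffer
function resolveUpdatedAfter(lastRunIso: string | undefined, config: IncrementalConfig): string {
  if (!lastRunIso) {
    return new Date(config.fallbackStartDate).toISOString();
  }
  const lastRunMs = Date.parse(lastRunIso);
  if (Number.isNaN(lastRunMs)) {
    throw new Error(`Invalid last run timestamp: ${lastRunIso}`);
  }
  return new Date(lastRunMs - config.overlapBufferSeconds * 1000).toISOString();
}

const exampleConfig: IncrementalConfig = {
  overlapBufferSeconds: 60,
  fallbackStartDate: '2024-01-01T00:00:00Z',
};
const firstRun = resolveUpdatedAfter(undefined, exampleConfig);
const laterRun = resolveUpdatedAfter('2025-01-15T12:00:00Z', exampleConfig);
```

Keeping this calculation separate from the KV and GraphQL calls makes it easy to unit test the first-run and buffered cases.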
+---
+## Mode 2: Date Range (Use Carefully) ⚠️
+### What It Does
+Extracts all records updated within a specific date window.
+### How It Works
+```typescript
+const startTime = Date.now();
+log.info('📦 Starting date range extraction');
+// User provides specific date range
+const startDate = ctx.activation?.getVariable('startDate'); // "2025-01-01T00:00:00Z"
+const endDate = ctx.activation?.getVariable('endDate'); // "2025-01-31T23:59:59Z"
+log.info('⏱️ Date range configured', { startDate, endDate });
+const result = await client.graphql({
+  query: PRODUCTS_QUERY,
+  variables: {
+    retailerId: 'my-retailer',
+    updatedAfter: startDate,
+    updatedBefore: endDate, // Bounded range
+    first: 200,
+  },
+  pagination: { maxRecords: 50000 },
+});
+log.info('✅ Extraction complete', {
+  recordCount: result.edges.length,
+  duration: Date.now() - startTime
+});
+```
+### When to Use
+- ⚠️ **One-time data audits** (e.g., "show me all orders from Q4 2024")
+- ⚠️ **Backfilling missed extractions** (e.g., extraction failed for 3 days)
+- ⚠️ **Historical reporting** (e.g., "export last month's inventory changes")
+- ⚠️ **Data quality investigations**
+### Risks
+- Can return tens of thousands of records
+- No natural rate limiting
+- Easy to exceed platform limits
+- Can timeout on large datasets
+- Risk of platform overload
+### Safety Guardrails Required
+```typescript
+// ✅ REQUIRED: Validate date range before extraction
+function validateDateRange(startDate: string, endDate: string, log: any) {
+  const start = new Date(startDate);
+  const end = new Date(endDate);
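+  const daysDiff = (end.getTime() - start.getTime()) / (1000 * 60 * 60 * 24);
+  // Reject unparseable dates and reversed ranges; NaN > MAX_DAYS is false, so they would otherwise pass
+  if (Number.isNaN(daysDiff) || daysDiff <= 0) {
+    throw new Error(`Invalid date range: ${startDate} to ${endDate}`);
+  }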
+  log.info('🔍 Validating date range', { startDate, endDate, daysDiff });
+  // Enforce maximum date range
+  const MAX_DAYS = 30;
+  if (daysDiff > MAX_DAYS) {
+    log.error('❌ Date range validation failed', {
+      daysDiff,
+      maxDays: MAX_DAYS,
+      recommendation: 'Split into smaller ranges or use incremental mode'
+    });
+    throw new Error(
+      `Date range too large: ${daysDiff} days (max: ${MAX_DAYS}). ` +
+        `Split into smaller ranges or use incremental mode.`
+    );
+  }
+  log.info('✅ Date range validation passed', { daysDiff, maxDays: MAX_DAYS });
+  return { daysDiff, isValid: true };
+}
+```
+### Recommended Limits
+| Entity           | Max Date Range | Max Records | File Splitting    |
+| ---------------- | -------------- | ----------- | ----------------- |
+| **Products**     | 30 days        | 50,000      | Required if > 25k |
+| **Orders**       | 7 days         | 20,000      | Required if > 10k |
+| **Fulfillments** | 7 days         | 20,000      | Required if > 10k |
+| **Inventory**    | 14 days        | 50,000      | Required if > 25k |
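The per-entity limits in the table can be encoded as a lookup so that the maximum range is not a single hard-coded constant. This is a sketch under the assumption that the table values are used as-is; `MAX_DAYS_BY_ENTITY` and `checkDateRange` are hypothetical names.

```typescript
// Maximum date range in days, copied from the Recommended Limits table.
const MAX_DAYS_BY_ENTITY = {
  products: 30,
  orders: 7,
  fulfillments: 7,
  inventory: 14,
} as const;

type Entity = keyof typeof MAX_DAYS_BY_ENTITY;

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the range length in days, or throws on unparseable dates,
// reversed ranges, or ranges longer than the entity's limit.
function checkDateRange(entity: Entity, startDate: string, endDate: string): number {
  const start = Date.parse(startDate);
  const end = Date.parse(endDate);
  if (Number.isNaN(start) || Number.isNaN(end)) {
    throw new Error(`Unparseable date range: ${startDate} .. ${endDate}`);
  }
  if (end <= start) {
    throw new Error(`endDate must be after startDate: ${startDate} .. ${endDate}`);
  }
  const days = (end - start) / MS_PER_DAY;
  const maxDays = MAX_DAYS_BY_ENTITY[entity];
  if (days > maxDays) {
    throw new Error(`Range of ${days.toFixed(1)} days exceeds the ${maxDays}-day limit for ${entity}`);
  }
  return days;
}

const orderDays = checkDateRange('orders', '2025-01-01T00:00:00Z', '2025-01-06T00:00:00Z');
```

A 30-day request for `orders` would throw here, which is the intended signal to split the range into smaller windows.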
+### Configuration Example
+```json
+{
+  "extractionMode": "dateRange",
+  "startDate": "2025-01-01T00:00:00Z",
+  "endDate": "2025-01-31T23:59:59Z",
+  "pageSize": 200,
+  "maxRecords": 50000,
+  "validateDateRange": true,
+  "maxDaysAllowed": 30
+}
+```
+### Production Restrictions
+- ❌ **Never schedule** date range extractions
+- ❌ **Never use in recurring workflows**
+- ✅ **Always validate date range** before execution
+- ✅ **Always monitor execution time** and record counts
+- ✅ **Always implement file splitting** for large results
+- ✅ **Always get approval** before running on production data
+---
+## File Splitting for Large Extractions
+### Overview
+When ExtractionOrchestrator fetches large datasets (e.g., 50,000 records), all records are loaded into memory. Before writing to the destination (SFTP/S3), you can split these records into multiple smaller files for better manageability, parallel processing, and downstream system compatibility.
+### When to Split Files
+- ✅ **Extraction returns >10k records** - Split into manageable chunks
+- ✅ **Downstream system has file size limits** - Respect partner constraints
+- ✅ **Parallel processing desired** - Write multiple files concurrently
+- ✅ **Network reliability concerns** - Smaller files = better retry granularity
+- ✅ **Audit trail requirements** - Track processing per file
+### Basic File Splitting Pattern
+```typescript
+import { Buffer } from 'node:buffer';
+import { ExtractionOrchestrator, SftpDataSource } from '@fluentcommerce/fc-connect-sdk';
+const startTime = Date.now();
+log.info('📦 Starting extraction with file splitting');
+// STEP 1: Extract all records (loads into memory)
+const result = await orchestrator.extract({
+  query: virtualPositionsQuery,
+  variables: { retailerId: 'my-retailer', updatedAfter: lastRunTime },
+  pagination: { pageSize: 200, maxRecords: 50000 },
+});
+log.info('✅ Extraction complete', {
+  recordCount: result.data.length,
+  duration: Date.now() - startTime
+});
+// result.data contains all 50,000 records in memory
+// STEP 2: Configure chunking
+const RECORDS_PER_FILE = 1000; // Configurable chunk size
+const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+// STEP 3: Split records into chunks
+const chunks: any[][] = [];
+for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
+  chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
+}
+log.info('📂 File splitting configured', {
+  totalRecords: result.data.length,
+  recordsPerFile: RECORDS_PER_FILE,
+  filesToCreate: chunks.length,
+});
+// STEP 4: Generate files with configurable naming pattern
+const filePromises = chunks.map((chunk, index) => {
+  // File naming pattern: entity-timestamp-part-NNN.format
+  const partNumber = String(index + 1).padStart(3, '0'); // 001, 002, 003...
+  const filename = `virtual-positions-${timestamp}-part-${partNumber}.xml`;
+  // Transform chunk to desired format (XML, CSV, JSON)
+  const xmlContent = xmlBuilder.build({ records: chunk });
+  // Return upload promise
+  return sftp.uploadFile(
+    `/outbound/${filename}`,
+    Buffer.from(xmlContent, 'utf8'),
+    { encoding: 'utf8', overwrite: false }
+  );
+});
+// STEP 5: Write all files in parallel
+const uploadStartTime = Date.now();
+await Promise.all(filePromises);
+log.info('✅ File splitting complete', {
+  filesCreated: chunks.length,
+  totalRecords: result.data.length,
+  recordsPerFile: RECORDS_PER_FILE,
+  uploadDuration: Date.now() - uploadStartTime,
+  totalDuration: Date.now() - startTime
+});
+```
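The chunking loop in STEP 3 and the part numbering in STEP 4 can be factored into two small helpers, which also makes it possible to reject a bad chunk size (for example `NaN` from a misconfigured variable) before any file is written. A minimal sketch; `chunkRecords` and `partLabel` are illustrative names, not SDK functions.

```typescript
// Split an array into consecutive chunks of at most `size` items.
function chunkRecords<T>(records: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error(`Chunk size must be a positive integer, got ${size}`);
  }
  const chunks: T[][] = [];
  for (let i = 0; i < records.length; i += size) {
    chunks.push(records.slice(i, i + size));
  }
  return chunks;
}

// Zero-padded, 1-based part label for a 0-based chunk index.
function partLabel(index: number, width = 3): string {
  return String(index + 1).padStart(width, '0');
}

const parts = chunkRecords([1, 2, 3, 4, 5], 2);
const firstLabel = partLabel(0);
```

The last chunk is simply shorter when the record count is not a multiple of the chunk size, so file names that encode record ranges should be derived from the configured size rather than from `chunk.length`.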
+### Configurable File Naming Patterns
+Support multiple naming patterns via configuration:
+```typescript
+// Configuration interface
+interface FileSplittingConfig {
+  enabled: boolean;
+  recordsPerFile: number;
+  namingPattern: 'sequential' | 'timestamp' | 'range' | 'custom';
+  customPattern?: (index: number, chunk: any[], total: number) => string;
+}
+// Pattern examples
+const patterns = {
+  // Pattern 1: Sequential numbering
+  sequential: (index: number) =>
+    `virtual-positions-part-${String(index + 1).padStart(3, '0')}.xml`,
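+  // Pattern 2: Timestamp-based (':' and '.' replaced so the name is filesystem-safe)
+  timestamp: (index: number) =>
+    `virtual-positions-${new Date().toISOString().replace(/[:.]/g, '-')}-${index + 1}.xml`,
+  // Pattern 3: Record range (offset uses the configured chunk size, because the last chunk may be shorter)
+  range: (index: number, chunk: any[], recordsPerFile: number) => {
+    const start = index * recordsPerFile + 1;
+    const end = start + chunk.length - 1;
+    return `virtual-positions-records-${start}-${end}.xml`;
+  },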
+  // Pattern 4: Custom with metadata
+  custom: (index: number, chunk: any[], totalChunks: number) =>
+    `VP_${new Date().toISOString().split('T')[0]}_${index + 1}_of_${totalChunks}.xml`,
+};
+```
+### Complete Example: Extraction with File Splitting
+```typescript
+import { schedule, fn, MemoryInterpreter } from '@versori/run';
+import { Buffer } from 'node:buffer';
+import {
+  ExtractionOrchestrator,
+  createClient,
+  SftpDataSource,
+  VersoriKVAdapter,
+  XMLBuilder,
+} from '@fluentcommerce/fc-connect-sdk';
+export const virtualPositionsExtraction = schedule(
+  'virtual-positions-hourly',
+  '0 * * * *'
+).then(
+  fn('extract', async (ctx) => {
+    const { log, activation, openKv } = ctx;
+    const startTime = Date.now();
+    log.info('📦 Starting virtual positions extraction');
+    // Configuration from activation variables
+    const FILE_SPLITTING_ENABLED = activation.getVariable('fileSplittingEnabled') !== 'false';
+    const RECORDS_PER_FILE = parseInt(activation.getVariable('recordsPerFile') || '1000', 10);
+    const FILE_NAMING_PATTERN = activation.getVariable('fileNamingPattern') || 'sequential';
+    // Declared outside the try block so the finally block can dispose it
+    let sftp: SftpDataSource | undefined;
+    try {
+      // Initialize services
+      const client = await createClient(ctx);
+      log.info('✅ Client initialized');
+      const kv = new VersoriKVAdapter(openKv(':project:'));
+      const orchestrator = new ExtractionOrchestrator(client, log);
+      // Extract records (all loaded into memory)
+      const extractionStartTime = Date.now();
+      const result = await orchestrator.extract({
+        query: virtualPositionsQuery,
+        variables: {
+          retailerId: activation.getVariable('fluentRetailerId'),
+          updatedAfter: await kv.get(['extraction', 'lastRunTime']) || '2025-01-01T00:00:00Z',
+        },
+        pagination: { pageSize: 200, maxRecords: 50000 },
+      });
+      log.info('✅ Extraction complete', {
+        recordCount: result.data.length,
+        fileSplittingEnabled: FILE_SPLITTING_ENABLED,
+        extractionDuration: Date.now() - extractionStartTime
+      });
+      // Initialize SFTP
+      sftp = new SftpDataSource({
+        type: 'SFTP_XML',
+        connectionId: 'sftp-extractions',
+        name: 'extraction-sftp',
+        settings: {
+          host: activation.getVariable('sftpHost'),
+          port: parseInt(activation.getVariable('sftpPort') || '22', 10),
+          username: activation.getVariable('sftpUsername'),
+          password: activation.getVariable('sftpPassword'),
+          remotePath: '/outbound/',
+          requireAbsolutePaths: true,
+        },
+      }, log);
+      await sftp.validateConnection();
+      log.info('✅ SFTP connection validated');
+      // File splitting logic
+      if (FILE_SPLITTING_ENABLED && result.data.length > RECORDS_PER_FILE) {
+        const splittingStartTime = Date.now();
+        // Split into chunks
+        const chunks: any[][] = [];
+        for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
+          chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
+        }
+        log.info('📂 Splitting extraction into multiple files', {
+          totalRecords: result.data.length,
+          recordsPerFile: RECORDS_PER_FILE,
+          filesToCreate: chunks.length,
+        });
+        // Generate files
+        const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+        const xmlBuilder = new XMLBuilder();
+        const filePromises = chunks.map(async (chunk, index) => {
+          // Generate filename based on pattern
+          let filename: string;
+          const partNumber = String(index + 1).padStart(3, '0');
+          switch (FILE_NAMING_PATTERN) {
+            case 'timestamp':
+              filename = `virtual-positions-${timestamp}-${partNumber}.xml`;
+              break;
+            case 'range':
+              const start = index * RECORDS_PER_FILE + 1;
+              const end = Math.min(start + chunk.length - 1, result.data.length);
+              filename = `virtual-positions-records-${start}-${end}.xml`;
+              break;
+            case 'sequential':
+            default:
+              filename = `virtual-positions-part-${partNumber}.xml`;
+              break;
+          }
+          // Build XML content
+          const xmlContent = xmlBuilder.build({ virtualPositions: chunk });
+          // Upload to SFTP
+          await sftp.uploadFile(
+            `/outbound/${filename}`,
+            Buffer.from(xmlContent, 'utf8'),
+            { encoding: 'utf8', overwrite: false }
+          );
+          return { filename, recordCount: chunk.length };
+        });
+        // Write all files in parallel
+        const uploadedFiles = await Promise.all(filePromises);
+        log.info('✅ File splitting complete', {
+          filesCreated: uploadedFiles.length,
+          totalRecords: result.data.length,
+          files: uploadedFiles,
+          splittingDuration: Date.now() - splittingStartTime,
+          totalDuration: Date.now() - startTime
+        });
+        return {
+          success: true,
+          extractionMode: 'incremental',
+          totalRecords: result.data.length,
+          filesCreated: uploadedFiles.length,
+          files: uploadedFiles,
+          duration: Date.now() - startTime
+        };
+      } else {
+        // Single file (no splitting)
+        const timestamp = new Date().toISOString().replace(/[:.]/g, '-');
+        const filename = `virtual-positions-${timestamp}.xml`;
+        const xmlBuilder = new XMLBuilder();
+        const xmlContent = xmlBuilder.build({ virtualPositions: result.data });
+        await sftp.uploadFile(
+          `/outbound/${filename}`,
+          Buffer.from(xmlContent, 'utf8'),
+          { encoding: 'utf8', overwrite: false }
+        );
+        log.info('✅ Single file extraction complete', {
+          filename,
+          recordCount: result.data.length,
+          duration: Date.now() - startTime
+        });
+        return {
+          success: true,
+          extractionMode: 'incremental',
+          totalRecords: result.data.length,
+          filesCreated: 1,
+          files: [{ filename, recordCount: result.data.length }],
+          duration: Date.now() - startTime
+        };
+      }
+    } catch (error: any) {
+      log.error('❌ Extraction failed', {
+        message: error instanceof Error ? error.message : String(error),
+        stack: error instanceof Error ? error.stack : undefined,
+        errorType: error instanceof Error ? error.constructor.name : 'Error',
+        duration: Date.now() - startTime
+      });
+      return {
+        success: false,
+        error: error instanceof Error ? error.message : String(error),
+        duration: Date.now() - startTime
+      };
+    } finally {
+      if (sftp) {
+        await sftp.dispose();
+        log.info('✅ SFTP connection disposed');
+      }
+    }
+  })
+);
+// Export with MemoryInterpreter for Versori platform
+export const interpreter = new MemoryInterpreter({
+  workflows: [virtualPositionsExtraction]
+});
+```
+### Date Range Splitting for Very Large Datasets
+For extractions spanning multiple years (e.g., historical dumps), split the date range and run multiple smaller extractions:
+```typescript
+// Split large date range into manageable chunks
+async function extractHistoricalData(
+  startDate: string,  // "2020-01-01"
+  endDate: string,    // "2025-01-01"
+  chunkMonths: number = 1,  // Extract 1 month at a time
+  log: any
+) {
+  const overallStartTime = Date.now();
+  const results = [];
+  let currentStart = new Date(startDate);
+  const finalEnd = new Date(endDate);
+  log.info('📦 Starting historical data extraction', {
+    startDate,
+    endDate,
+    chunkMonths,
+    estimatedChunks: Math.ceil((finalEnd.getTime() - currentStart.getTime()) / (chunkMonths * 30 * 24 * 60 * 60 * 1000))
+  });
+  while (currentStart < finalEnd) {
+    const chunkStartTime = Date.now();
+    // Calculate chunk end date
+    const currentEnd = new Date(currentStart);
+    currentEnd.setMonth(currentEnd.getMonth() + chunkMonths);
+    // Ensure we don't exceed final end date
+    if (currentEnd > finalEnd) {
+      currentEnd.setTime(finalEnd.getTime());
+    }
+    log.info('📂 Extracting date range chunk', {
+      chunkNumber: results.length + 1,
+      startDate: currentStart.toISOString(),
+      endDate: currentEnd.toISOString(),
+    });
+    // Extract for this date range
+    const result = await orchestrator.extract({
+      query: virtualPositionsQuery,
+      variables: {
+        retailerId: 'my-retailer',
+        updatedAfter: currentStart.toISOString(),
+        updatedBefore: currentEnd.toISOString(),
+      },
+      pagination: { pageSize: 200, maxRecords: 50000 },
+    });
+    log.info('✅ Chunk extraction complete', {
+      chunkNumber: results.length + 1,
+      recordCount: result.data.length,
+      chunkDuration: Date.now() - chunkStartTime
+    });
+    // Split this chunk into files if needed
+    if (result.data.length > RECORDS_PER_FILE) {
+      const chunks: any[][] = [];
+      for (let i = 0; i < result.data.length; i += RECORDS_PER_FILE) {
+        chunks.push(result.data.slice(i, i + RECORDS_PER_FILE));
+      }
+      // Write chunks in parallel
+      const timestamp = currentStart.toISOString().split('T')[0];
+      await Promise.all(
+        chunks.map((chunk, index) => {
+          const filename = `historical-${timestamp}-part-${String(index + 1).padStart(3, '0')}.xml`;
+          return sftp.uploadFile(`/outbound/${filename}`, Buffer.from(xmlBuilder.build(chunk), 'utf8'));
+        })
+      );
+    }
+    results.push({
+      startDate: currentStart.toISOString(),
+      endDate: currentEnd.toISOString(),
+      recordCount: result.data.length,
+      duration: Date.now() - chunkStartTime
+    });
+    // Move to next chunk
+    currentStart = new Date(currentEnd);
+    // Rate limiting - wait between chunks
+    log.info('⏱️ Rate limiting delay (5 seconds)');
+    await new Promise(resolve => setTimeout(resolve, 5000)); // 5 second delay
+  }
+  const totalRecords = results.reduce((sum, r) => sum + r.recordCount, 0);
+  const totalDuration = Date.now() - overallStartTime;
+  log.info('✅ Historical extraction complete', {
+    totalChunks: results.length,
+    totalRecords,
+    totalDuration,
+    averageRecordsPerChunk: Math.round(totalRecords / results.length)
+  });
+  return results;
+}
+// Usage: Extract 5 years of data, 1 month at a time
+const historicalResults = await extractHistoricalData('2020-01-01', '2025-01-01', 1, log);
+```
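The window calculation inside the loop above can be pulled out into a pure function, so the plan can be inspected or tested before any extraction runs. This sketch uses UTC month arithmetic and assumes the start date falls on a day that exists in every month (such as the 1st); `monthlyWindows` and `DateWindow` are hypothetical names.

```typescript
interface DateWindow {
  start: string; // ISO-8601, inclusive lower bound (updatedAfter)
  end: string;   // ISO-8601, upper bound (updatedBefore)
}

// Build consecutive windows of `chunkMonths` months covering [startDate, endDate).
// The final window is clamped to endDate.
function monthlyWindows(startDate: string, endDate: string, chunkMonths = 1): DateWindow[] {
  const finalEnd = new Date(endDate);
  let cursor = new Date(startDate);
  if (Number.isNaN(cursor.getTime()) || Number.isNaN(finalEnd.getTime())) {
    throw new Error(`Unparseable date range: ${startDate} .. ${endDate}`);
  }
  if (!Number.isInteger(chunkMonths) || chunkMonths < 1) {
    throw new Error(`chunkMonths must be a positive integer, got ${chunkMonths}`);
  }
  const windows: DateWindow[] = [];
  while (cursor < finalEnd) {
    const next = new Date(cursor);
    // UTC arithmetic avoids shifts caused by the host's local timezone.
    next.setUTCMonth(next.getUTCMonth() + chunkMonths);
    const end = next > finalEnd ? finalEnd : next;
    windows.push({ start: cursor.toISOString(), end: end.toISOString() });
    cursor = end;
  }
  return windows;
}

const windows = monthlyWindows('2020-01-01', '2020-04-01');
```

Each window can then be passed to the extraction call in turn, with the delay between chunks applied by the caller.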
+### Configuration Variables for File Splitting
+Add these to Versori activation variables:
+```bash
+# File Splitting Configuration
+FILE_SPLITTING_ENABLED=true              # Enable/disable file splitting
+RECORDS_PER_FILE=1000                    # Records per file (default: 1000)
+FILE_NAMING_PATTERN=sequential           # Naming pattern: sequential | timestamp | range
+MAX_RECORDS_PER_EXTRACTION=50000         # Maximum records to extract in one run
+# Date Range Splitting (for large historical extractions)
+DATE_RANGE_CHUNK_MONTHS=1                # Months per extraction chunk (default: 1)
+RATE_LIMIT_DELAY_MS=5000                 # Delay between chunks (milliseconds)
+```
+### Benefits of File Splitting
+- ✅ **Bounded file payloads** - Each file is built and uploaded from one chunk at a time (the full result set is still held in memory during extraction; see Limitations)
+- ✅ **Parallel writes** - Multiple files written concurrently (faster)
+- ✅ **Better error recovery** - Retry individual files vs entire extraction
+- ✅ **Downstream compatibility** - Honor partner file size limits
+- ✅ **Audit granularity** - Track processing per file
+- ✅ **Network resilience** - Smaller files = better upload success rate
+### Limitations
+- All records must fit in memory during extraction (ExtractionOrchestrator loads all pages)
+- File splitting happens post-extraction (not during pagination)
+- Parallel writes limited by available memory and network bandwidth
+- SFTP connection pool size may limit concurrency (default: 10 connections)
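Since `Promise.all` over every chunk starts all uploads at once, a concurrency cap may be useful when the connection pool is smaller than the number of files. The helper below is a generic sketch, not an SDK feature; `mapWithConcurrency` is a hypothetical name and the worker in the usage lines is a stand-in for the real upload call.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight.
// Results are returned in input order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error(`limit must be a positive integer, got ${limit}`);
  }
  const results = new Array<R>(items.length);
  let next = 0;
  // Each runner pulls the next unclaimed index until none remain.
  const runners = Array.from({ length: Math.min(limit, items.length) }, async () => {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  });
  await Promise.all(runners);
  return results;
}

// Usage sketch: track how many workers overlap.
let inFlight = 0;
let peak = 0;
const uploads = mapWithConcurrency(['a.xml', 'b.xml', 'c.xml', 'd.xml', 'e.xml'], 2, async (name) => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await Promise.resolve(); // stand-in for an upload call
  inFlight--;
  return name.toUpperCase();
});
```

If one worker rejects, the returned promise rejects while the other runners finish their current item, so callers that need per-file retry should catch errors inside the worker.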
+---
+## Historical Data Extraction via Date Range
+### Overview
+There is **no separate "historical mode"** in the SDK. To extract historical data (e.g., all records from 2020 onwards), use **Date Range mode** with a very old start date and appropriate safeguards.
+### How to Extract Historical Data
+```typescript
+// Historical extraction is just Date Range with old start date
+await client.graphql({
+  query: PRODUCTS_QUERY,
+  variables: {
+    retailerId: 'my-retailer',
+    updatedAfter: '1970-01-01T00:00:00Z',  // Very old date = "all records"
+    updatedBefore: new Date().toISOString(), // Up to now
+    first: 200,
+  },
+  pagination: { maxRecords: 50000 },
+});
+```
+### Required Safeguards for Historical Extraction
+1. **Date Range Splitting** - Split into smaller chunks (monthly/quarterly)
+2. **File Splitting** - Split large results into multiple files
+3. **Rate Limiting** - Add delays between chunks
+4. **Validation** - Verify date ranges before execution
+5. **Monitoring** - Track progress and alert on anomalies
+6. **Approval** - Get sign-off before running on production
+### Recommended Approach: Chunked Date Range Extraction
+```bash
+#!/bin/bash
+# Safe historical extraction via chunked date ranges
+START_DATE="2020-01-01"
+END_DATE="2025-01-01"
+# Extract one month at a time
+current=$START_DATE
+while [[ "$current" < "$END_DATE" ]]; do
+  # Calculate month end (uses GNU date's -d option; BSD/macOS date needs different flags)
+  monthEnd=$(date -d "$current + 1 month" +%Y-%m-%d)
+  echo "Extracting $current to $monthEnd..."
+  # Trigger date range extraction
+  curl -X POST https://versori-webhook.com/extract \
+    -H "Content-Type: application/json" \
+    -d "{
+      \"extractionMode\": \"dateRange\",
+      \"startDate\": \"${current}T00:00:00Z\",
+      \"endDate\": \"${monthEnd}T00:00:00Z\",
+      \"fileSplittingEnabled\": true,
+      \"recordsPerFile\": 1000
+    }"
+  # Rate limiting - wait 60 seconds between chunks
+  sleep 60
+  # Move to next month
+  current=$monthEnd
+done
+echo "Historical extraction complete"
+```
+### Migration to Incremental After Historical Load
+After completing a one-time historical extraction, switch to incremental mode for ongoing syncs:
+```json
+{
+  "extractionMode": "incremental",
+  "fallbackStartDate": "2025-01-22T00:00:00Z",
+  "pageSize": 200,
+  "maxRecords": 10000
+}
+```
+---
+## Decision Tree: Which Mode to Use?
+```
+Start Here
+    │
+    ├─ Need recurring extractions? ─────────► Use INCREMENTAL ✅
+    │                                          (hourly/daily/every 15 min)
+    │                                          Tracks state, auto-recovery
+    │
+    └─ Need specific date range? ────────────► Use DATE RANGE ⚠️
+         │                                      (one-time, validate range)
+         │
+         ├─ Date range < 30 days? ──────────► Single DATE RANGE run
+         │                                     + file splitting if >10k records
+         │
+         └─ Date range > 30 days? ──────────► Split into monthly chunks
+              │                               + file splitting per chunk
+              │                               + rate limiting between chunks
+              │
+              └─ Historical (all data)? ────► DATE RANGE with old start date
+                                              + chunked approach (monthly)
+                                              + approval required
+```
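The decision tree can also be expressed as a small function, which is handy when a workflow has to pick a strategy from request parameters. This is an illustrative sketch: the type names, the `chooseStrategy` function, and the treatment of exactly 30 days as a single run are assumptions based on the 30-day validation limit used earlier in this guide.

```typescript
type Strategy = 'incremental' | 'single-date-range' | 'chunked-date-range';

interface ExtractionRequest {
  recurring: boolean;
  rangeDays?: number; // length of the requested window; required for one-off requests
}

const MAX_SINGLE_RANGE_DAYS = 30;

// Recurring work always uses incremental mode; one-off requests use a single
// date range run up to the limit, and monthly chunks beyond it.
function chooseStrategy(request: ExtractionRequest): Strategy {
  if (request.recurring) {
    return 'incremental';
  }
  if (request.rangeDays === undefined) {
    throw new Error('One-off extractions need an explicit date range');
  }
  return request.rangeDays <= MAX_SINGLE_RANGE_DAYS ? 'single-date-range' : 'chunked-date-range';
}

const scheduled = chooseStrategy({ recurring: true });
const audit = chooseStrategy({ recurring: false, rangeDays: 14 });
const historical = chooseStrategy({ recurring: false, rangeDays: 1827 });
```

File splitting, rate limiting, and approval remain separate concerns layered on top of whichever strategy is chosen.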
+---
+## Monitoring & Alerts
+Set up alerts for extraction volumes:
+```typescript
+// In extraction workflow
+const recordCount = edges.length;
+const ALERT_THRESHOLD = 50000;
+log.info('📊 Checking extraction volume', { recordCount, threshold: ALERT_THRESHOLD });
+if (recordCount > ALERT_THRESHOLD) {
+  log.error('❌ Extraction volume exceeded threshold', {
+    recordCount,
+    threshold: ALERT_THRESHOLD,
+    percentageOver: Math.round(((recordCount - ALERT_THRESHOLD) / ALERT_THRESHOLD) * 100),
+    mode: extractionMode,
+    recommendation: 'Switch to incremental mode or reduce date range',
+  });
+  // Send alert to monitoring system
+  await sendAlert({
+    severity: 'high',
+    message: `Extraction returned ${recordCount} records (threshold: ${ALERT_THRESHOLD})`,
+    mode: extractionMode,
+  });
+  log.info('✅ Alert sent to monitoring system');
+} else {
+  log.info('✅ Extraction volume within acceptable limits', {
+    recordCount,
+    threshold: ALERT_THRESHOLD,
+    percentageUsed: Math.round((recordCount / ALERT_THRESHOLD) * 100)
+  });
+}
+```
+---
+## Summary & Best Practices
+### ✅ DO
+- Use **incremental mode** for all scheduled extractions
+- Validate date ranges before running **dateRange mode**
+- Implement **file splitting** for large extractions (>10k records)
+- Use **parallel writes** with `Promise.all()` for split files
+- Monitor extraction volumes and set alerts
+- Test on staging before production
+- Use overlap buffer (60s) to prevent gaps in incremental mode
+- Track state with VersoriKV
+- Split large historical extractions into monthly chunks
+- Add rate limiting between extraction chunks
+### ❌ DON'T
+- Schedule **dateRange** extractions (use incremental mode for anything recurring)
+- Use date ranges > 30 days without chunking
+- Skip validation checks for date ranges
+- Ignore volume alerts (>50k records)
+- Forget to implement file splitting for large results
+- Run historical extractions without approval and monitoring
+---
+## ExtractionOrchestrator (SDK v0.1.27+)
+The SDK includes **ExtractionOrchestrator** - a high-level service that simplifies extraction workflows with built-in mode handling, pagination, and output management.
+### Why Use ExtractionOrchestrator?
+**Instead of manually implementing:**
+- Mode detection (incremental/dateRange)
+- Pagination loops
+- Path-based field extraction
+- Output formatting (CSV/JSON/Parquet)
+- S3/SFTP uploads
+- Error handling
+**ExtractionOrchestrator handles it all:**
+```typescript
+import { ExtractionOrchestrator, createClient } from '@fluentcommerce/fc-connect-sdk';
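+// Assumes `ctx` and `log` come from the surrounding Versori workflow context,
+// and that `virtualPositionsQuery` holds the GraphQL query string.
+const startTime = Date.now();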
+log.info('📦 Initializing ExtractionOrchestrator');
+const client = await createClient(ctx);
+const orchestrator = new ExtractionOrchestrator(client, log);
+log.info('✅ Orchestrator initialized');
+// Both modes supported with single interface
+const result = await orchestrator.extract({
+  query: virtualPositionsQuery,
+  variables: { retailerId: 'my-retailer' },
+  // Mode: 'incremental' (scheduled) or 'dateRange' (ad-hoc)
+  extractionMode: 'incremental',
+  stateKey: 'virtual-positions-extraction', // the full example below also passes a `stateAdapter`
+  // Pagination handled automatically
+  pagination: {
+    pageSize: 200,
+    maxRecords: 10000,
+  },
+  // Output format and destination
+  outputFormat: 'csv',
+  outputDestination: {
+    type: 's3',
+    bucket: 'my-extracts',
+    key: 'virtual-positions/hourly/{{timestamp}}.csv',
+  },
+  // Field extraction from nested paths
+  fieldPaths: {
+    position_ref: 'ref',
+    quantity: 'quantity',
+    location_ref: 'locationLink.ref',
+    location_name: 'locationLink.name',
+  },
+});
+log.info('✅ Extraction and upload complete', {
+  recordCount: result.recordCount,
+  outputFile: result.outputFile,
+  duration: Date.now() - startTime
+});
+```
+### Features
+- **Auto-pagination**: Handles cursor-based pagination automatically
+- **Mode support**: Both modes (incremental, dateRange)
+- **State management**: Tracks last run timestamps for incremental mode
+- **Path extraction**: Extracts nested fields from GraphQL responses
+- **Multi-format**: Outputs CSV, JSON, or Parquet
+- **Validation**: Built-in query and response validation
+- **Error recovery**: Graceful failure handling with detailed logs
+- **File splitting**: Not built in; add post-extraction chunking manually for large datasets (see the File Splitting section above)
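To make the path-extraction feature concrete, the sketch below shows how a `fieldPaths` map with dotted paths such as `locationLink.ref` could flatten one GraphQL node into an output row. It only illustrates the idea and is not the SDK's internal implementation; `getByPath` and `flattenNode` are hypothetical names.

```typescript
// Illustration only: resolve dotted paths against a GraphQL node.
type FieldPaths = Record<string, string>;

// Walks `path` segment by segment; returns undefined if any segment is missing.
function getByPath(source: unknown, path: string): unknown {
  let current: unknown = source;
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') {
      return undefined;
    }
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Builds one flat row: output column name -> value found at the configured path.
function flattenNode(node: unknown, fieldPaths: FieldPaths): Record<string, unknown> {
  const row: Record<string, unknown> = {};
  for (const column of Object.keys(fieldPaths)) {
    row[column] = getByPath(node, fieldPaths[column]);
  }
  return row;
}

// Sample node shaped like the virtualPositions example in this guide.
const sampleRow = flattenNode(
  { ref: 'VP-1', quantity: 5, locationLink: { ref: 'LOC-1', name: 'Sydney' } },
  {
    position_ref: 'ref',
    quantity: 'quantity',
    location_ref: 'locationLink.ref',
    location_name: 'locationLink.name',
  }
);
```

A missing intermediate object yields `undefined` for that column instead of throwing, which suits CSV output where empty cells are acceptable.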
+### Example: Incremental Extraction with ExtractionOrchestrator
+```typescript
+import { schedule, fn, MemoryInterpreter } from '@versori/run';
+import { Buffer } from 'node:buffer';  // Required for Deno/Versori runtime
+import {
+  ExtractionOrchestrator,
+  createClient,
+  VersoriKVAdapter,
+} from '@fluentcommerce/fc-connect-sdk';
+export const hourlyExtraction = schedule('hourly-virtual-positions', '0 * * * *').then(
+  fn('extract', async (ctx) => {
+    const { log, openKv, env } = ctx;
+    const startTime = Date.now();
+    log.info('📦 Starting hourly virtual positions extraction');
+    const client = await createClient(ctx);
+    log.info('✅ Client initialized');
+    const kv = new VersoriKVAdapter(openKv(':project:'));
+    const orchestrator = new ExtractionOrchestrator(client, log);
+    const result = await orchestrator.extract({
+      query: `
+        query GetVirtualPositions($retailerId: String!, $updatedAfter: String, $first: Int, $after: String) {
+          virtualPositions(retailerId: $retailerId, updatedAfter: $updatedAfter, first: $first, after: $after) {
+            edges {
+              node {
+                ref
+                quantity
+                productRef
+                locationLink { ref name }
+                updatedOn
+              }
+              cursor
+            }
+            pageInfo {
+              hasNextPage
+              # Note: Fluent doesn't return endCursor - cursors are in edges[].cursor
+            }
+          }
+        }
+      `,
+      variables: { retailerId: env.FLUENT_RETAILER_ID },
+      extractionMode: 'incremental',
+      stateAdapter: kv,
+      stateKey: 'hourly-extraction',
+      fallbackStartDate: '2025-01-01T00:00:00Z',
+      pagination: { pageSize: 200, maxRecords: 10000 },
+      outputFormat: 'csv',
+      outputDestination: {
+        type: 's3',
+        bucket: env.S3_BUCKET,
+        key: 'virtual-positions/{{date}}/{{timestamp}}.csv',
+        config: {
+          accessKeyId: env.AWS_ACCESS_KEY_ID,
+          secretAccessKey: env.AWS_SECRET_ACCESS_KEY,
+          region: env.AWS_REGION,
+        },
+      },
+    });
+    log.info('✅ Extraction complete', {
+      success: result.success,
+      recordCount: result.recordCount,
+      outputFile: result.outputFile,
+      duration: Date.now() - startTime
+    });
+    return {
+      success: result.success,
+      recordCount: result.recordCount,
+      outputFile: result.outputFile,
+      duration: Date.now() - startTime
+    };
+  })
+);
+// Export with MemoryInterpreter for Versori platform
+export const interpreter = new MemoryInterpreter({
+  workflows: [hourlyExtraction]
+});
+```
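The query comment above notes that Fluent does not return `endCursor`, so the next page's cursor has to come from the last edge. The orchestrator handles this internally; for manual pagination, a loop might look like the sketch below. All names here (`Edge`, `Connection`, `FetchPage`, `lastEdgeCursor`, `fetchAll`) are hypothetical, and `fetchPage` stands in for whatever function actually executes the GraphQL query.

```typescript
// Sketch of cursor pagination when only edges[].cursor is available.
interface Edge<T> {
  node: T;
  cursor: string;
}

interface Connection<T> {
  edges: Edge<T>[];
  pageInfo: { hasNextPage: boolean };
}

// Placeholder signature for a function that runs the query for one page.
type FetchPage<T> = (after: string | undefined, first: number) => Promise<Connection<T>>;

// The cursor for the next request is the cursor of the last edge on this page.
function lastEdgeCursor<T>(page: Connection<T>): string | undefined {
  return page.edges[page.edges.length - 1]?.cursor;
}

async function fetchAll<T>(
  fetchPage: FetchPage<T>,
  pageSize: number,
  maxRecords: number
): Promise<T[]> {
  const records: T[] = [];
  let after: string | undefined;
  while (records.length < maxRecords) {
    const page = await fetchPage(after, pageSize);
    for (const edge of page.edges) {
      if (records.length >= maxRecords) break; // respect the record cap
      records.push(edge.node);
    }
    after = lastEdgeCursor(page);
    // Stop when the server reports no more pages or the page was empty.
    if (!page.pageInfo.hasNextPage || after === undefined) break;
  }
  return records;
}
```

The `maxRecords` cap mirrors the `pagination.maxRecords` option used in the examples, so a runaway query stops after a bounded number of records.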
+### When to Use ExtractionOrchestrator
+- **✅ Use for**: New extraction workflows, scheduled extractions, standard use cases
+- **⚠️ Use a manual approach for**: Complex transformations, custom business logic, non-standard outputs
+See [ExtractionOrchestrator API Reference](../../../../02-CORE-GUIDES/extraction/modules/02-core-guides-extraction-08-extraction-orchestrator.md) for complete documentation.
+---
+## See Also
+- [CLI Validation Workflow](../../../../02-CORE-GUIDES/api-reference/modules/api-reference-11-cli-tools.md) - Validate queries and mappings
+- [Production Safety Guide](../../../../02-CORE-GUIDES/ingestion/modules/02-core-guides-ingestion-09-best-practices.md) - General safety practices
+- [GraphQL Query Examples](./graphql-queries/) - Sample extraction queries
+- [Universal Mapping Guide](../../../../02-CORE-GUIDES/advanced-services/advanced-services-readme.md) - Field mapping documentation
+- [ExtractionOrchestrator Examples](../../../../03-PATTERN-GUIDES/examples/test-data/03-PATTERN-GUIDES-readme.md) - Complete working examples