RubyGems - csvops - Versions diffs - 0.6.0.alpha → 0.7.0.alpha - Mend

csvops 0.6.0.alpha → 0.7.0.alpha


Files changed (43)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: f7db22cb84c1d08c58b473368f9ad37575a217d6293539309277ed2b032a2852
-  data.tar.gz: 124bebc822fefa5d1f71286701959876260c82164067c36ff94b712a0b4cc1b3
+  metadata.gz: 803fa825ef1f50edcd7c0bc032a86926d356cb3ba6d943c460d59759a953fdcd
+  data.tar.gz: 2ba2afc9951aa96e777cbf3ea81dc77a41c88d2546505c885302607432461633
 SHA512:
-  metadata.gz: a8b8dbcfb66073f46f0ecc625267081fbe730e69ef9295f5d2303af6b831a9d71ef564f78f5b44212eb33c4ad7a5fdb78b54fa98e21dd58669e9494a5d3325fb
-  data.tar.gz: 05cbcaa2ca3116ad463413e53600d32a53df0941ceb8873ed22c2ef2d4cfe1afc8f90e44c7ff4400212ebbd5083a2ceb6a983281291436e9478d5087cc98b9ad
+  metadata.gz: 4f82dd7e9d3ac5ff53f8aaf40a0e5500e9b074aa052a031f6de4f5a2cc1ab711a5c375d5c203bdfaae802d36a02ecf14c4f73231a9f14e31d2f042ffeecd9a08
+  data.tar.gz: f9428d2ef29d257c99b484c7277dcff566dd5cf09ec06b78b4514c410b7858ffd6854f8aafd39a727c9c3d1e44e6940bc15456f3b11fdcac4a5b879bee9cc826

data/README.md CHANGED

@@ -38,11 +38,12 @@ CSV Tool Menu
 3. Randomize rows
 4. Dedupe using another CSV
 5. Validate parity
-6. Exit
+6. Split CSV into chunks
+7. Exit
 >
 ```
-Select `1` for column extraction, `2` for row-range extraction, `3` for row randomization, `4` for cross-CSV dedupe, or `5` for parity validation.
+Select `1` for column extraction, `2` for row-range extraction, `3` for row randomization, `4` for cross-CSV dedupe, `5` for parity validation, or `6` for CSV splitting.
 ### 3. Follow prompts
@@ -61,6 +62,7 @@ Prompt flow by action:
 - `Randomize rows`: file path, separator, headers present, optional seed, output destination.
 - `Dedupe using another CSV`: source/reference files, separators, header modes, key selectors, match options, output destination.
 - `Validate parity`: left/right files, separator, header mode, parity summary, mismatch samples.
+- `Split CSV into chunks`: source file, separator, header mode, chunk size, output directory/prefix, overwrite policy, optional manifest.
 ### 4. Example interaction (console output)
@@ -129,10 +131,11 @@ Legend: ` ` = prompt/menu, `+` = user input, `-` = tool output
  CSV Tool Menu
  1. Extract column
  2. Extract rows (range)
- 3. Randomize rows
- 4. Dedupe using another CSV
- 5. Validate parity
- 6. Exit
+3. Randomize rows
+4. Dedupe using another CSV
+5. Validate parity
+ 6. Split CSV into chunks
+ 7. Exit
 +> 4
  CSV file path: /tmp/source.csv
  Source CSV separator:
@@ -177,10 +180,11 @@ Legend: ` ` = prompt/menu, `+` = user input, `-` = tool output
  CSV Tool Menu
  1. Extract column
  2. Extract rows (range)
- 3. Randomize rows
- 4. Dedupe using another CSV
- 5. Validate parity
- 6. Exit
+3. Randomize rows
+4. Dedupe using another CSV
+5. Validate parity
+ 6. Split CSV into chunks
+ 7. Exit
 +> 5
  Left CSV file path: /tmp/left.csv
  Right CSV file path: /tmp/right.csv
@@ -208,6 +212,41 @@ Legend: ` ` = prompt/menu, `+` = user input, `-` = tool output
 - Exact duplicate semantics are preserved by count deltas per normalized row value.
 - Memory scales with the number of distinct row keys in the parity map, not the total input row count.
+### 10. Split interaction example
+Legend: ` ` = prompt/menu, `+` = user input, `-` = tool output
+```diff
+ CSV Tool Menu
+ 1. Extract column
+ 2. Extract rows (range)
+ 3. Randomize rows
+ 4. Dedupe using another CSV
+ 5. Validate parity
+ 6. Split CSV into chunks
+ 7. Exit
++> 6
+ Source CSV file path: /tmp/people.csv
+ Choose separator:
+ 1. comma (,)
+ 2. tab (\t)
+ 3. semicolon (;)
+ 4. pipe (|)
+ 5. custom
++Separator choice [1]: 1
+ Headers present? [Y/n]:
++Rows per chunk: 1000
+ Output directory [/tmp]:
+ Output file prefix [people]:
+ Overwrite existing chunk files? [y/N]:
+ Write manifest file? [y/N]:
+-Split complete.
+-Chunk size: 1000
+-Data rows: 25000
+-Chunks written: 25
+-/tmp/people_part_001.csv
+```
 ## Testing
 Run tests:
@@ -224,7 +263,7 @@ bundle exec rake test
 ## Alpha release
-Current prerelease version: `0.5.0.alpha`
+Current prerelease version: `0.7.0.alpha`
 Install prerelease from RubyGems:
@@ -234,7 +273,7 @@ gem install csvops --pre
 Release runbook:
-- `docs/release-v0.5.0-alpha.md`
+- `docs/release-v0.7.0-alpha.md`
 ## Architecture
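
The split interaction example in the README reports 25 chunks for 25,000 data rows at 1,000 rows per chunk. A minimal sketch of that arithmetic, assuming the final chunk simply holds whatever rows remain; the helper name `expected_chunk_count` is hypothetical and not part of csvops:

```ruby
# Hypothetical helper, not part of csvops: number of chunk files expected
# for a given data-row count (header row excluded) and chunk size.
def expected_chunk_count(data_rows, chunk_size)
  raise ArgumentError, "chunk size must be positive" unless chunk_size.positive?

  # Ceiling division using integers only.
  (data_rows + chunk_size - 1) / chunk_size
end

puts expected_chunk_count(25_000, 1_000) # the case shown in the README example
puts expected_chunk_count(25_001, 1_000) # one extra row needs one more chunk
```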

data/docs/architecture.md CHANGED

@@ -2,15 +2,15 @@
 The codebase follows a DDD-lite layered structure:
-- `domain/`: core domain models and invariants (`ColumnSession`, `RowSession`, `RandomizationSession`, and `CrossCsvDedupeSession` aggregates + supporting entities/value objects).
-- `application/`: use-case orchestration (`RunExtraction`, `RunRowExtraction`, `RunRowRandomization`, `RunCrossCsvDedupe`, `RunCsvParity`).
+- `domain/`: core domain models and invariants (`ColumnSession`, `RowSession`, `RandomizationSession`, `CrossCsvDedupeSession`, and `CsvSplitSession` aggregates + supporting entities/value objects).
+- `application/`: use-case orchestration (`RunExtraction`, `RunRowExtraction`, `RunRowRandomization`, `RunCrossCsvDedupe`, `RunCsvParity`, `RunCsvSplit`).
 - `infrastructure/`: CSV reading/streaming/comparison and output adapters (console/file).
 - `interface/cli/`: menu, prompts, workflows, and user-facing error presentation.
 - `Csvtool::CLI`: entrypoint wiring from command args to interface/application flow.
 ## Workflow boundary (standardized)
-For all interactive domains (`Column Extraction`, `Row Extraction`, `Row Randomization`, `Cross-CSV Dedupe`, `CSV Parity`), the boundary is:
+For all interactive domains (`Column Extraction`, `Row Extraction`, `Row Randomization`, `Cross-CSV Dedupe`, `CSV Parity`, `CSV Split`), the boundary is:
 - `interface/cli/workflows/*`: owns prompts, stdout rendering, and user-facing error presentation.
 - `interface/cli/workflows/builders/*`: builds domain sessions/aggregates from prompt results.
@@ -33,6 +33,7 @@ Current usage:
 - `RunRowRandomizationWorkflow` uses `WorkflowStepPipeline` + `Steps::RowRandomization::*`.
 - `RunCrossCsvDedupeWorkflow` uses `WorkflowStepPipeline` + `Steps::CrossCsvDedupe::*`.
 - `RunCsvParityWorkflow` uses `WorkflowStepPipeline` + `Steps::Parity::*`.
+- `RunCsvSplitWorkflow` uses `WorkflowStepPipeline` + `Steps::CsvSplit::*`.
 ## Adding New Concepts
@@ -108,7 +109,7 @@ For a new function type, prefer one of these patterns:
 ## Domain model
-Bounded contexts: `Column Extraction`, `Row Extraction`, `Row Randomization`, `Cross-CSV Dedupe`, and `CSV Parity`.
+Bounded contexts: `Column Extraction`, `Row Extraction`, `Row Randomization`, `Cross-CSV Dedupe`, `CSV Parity`, and `CSV Split`.
 ### Cross-CSV Dedupe (Large-file behavior)
@@ -421,6 +422,60 @@ classDiagram
   RunCsvParity --> CsvParityComparator
 ```
+### CSV Split
+Core DDD structure:
+- Aggregate root: `SplitSession`
+  - Captures one CSV split request.
+  - Holds split source and split options.
+- Entities:
+  - `SplitSource` (path + separator + header mode)
+- Value objects:
+  - `SplitOptions` (chunk size, output directory, file prefix, overwrite policy, optional manifest configuration)
+- Application service:
+  - `Application::UseCases::RunCsvSplit` orchestrates split execution and returns request/result style payloads.
+- Infrastructure adapters:
+  - `Infrastructure::CSV::CsvSplitter` (streaming row-by-row chunk writer)
+  - `Infrastructure::Output::CsvSplitManifestWriter` (optional manifest output)
+- Interface adapters:
+  - `Interface::CLI::MenuLoop`
+  - `Interface::CLI::Workflows::RunCsvSplitWorkflow`
+  - `Interface::CLI::Workflows::Builders::CsvSplitSessionBuilder`
+  - `Interface::CLI::Workflows::Steps::WorkflowStepPipeline`
+  - `Interface::CLI::Workflows::Steps::CsvSplit::*`
+  - `Interface::CLI::Workflows::Presenters::CsvSplitPresenter`
+  - `Interface::CLI::Workflows::Support::ResultErrorHandler`
+  - `Interface::CLI::Prompts::*`
+  - `Interface::CLI::Errors::Presenter`
+```mermaid
+classDiagram
+  direction LR
+  class MenuLoop
+  class RunCsvSplitWorkflow
+  class Prompts
+  class Errors
+  class RunCsvSplit
+  class SplitSession
+  class SplitSource
+  class SplitOptions
+  class CsvSplitter
+  class CsvSplitManifestWriter
+  class CsvSplitPresenter
+  MenuLoop --> RunCsvSplitWorkflow : invokes
+  RunCsvSplitWorkflow --> Prompts : uses
+  RunCsvSplitWorkflow --> Errors : reports failures
+  RunCsvSplitWorkflow --> CsvSplitPresenter : renders
+  RunCsvSplitWorkflow --> RunCsvSplit : calls
+  RunCsvSplit --> SplitSession : orchestrates
+  SplitSession o-- SplitSource
+  SplitSession o-- SplitOptions
+  RunCsvSplit --> CsvSplitter
+  RunCsvSplit --> CsvSplitManifestWriter
+```
 ## Project layout
 ```text
@@ -431,12 +486,14 @@ lib/csvtool/domain/row_session/*
 lib/csvtool/domain/row_randomization_session/*
 lib/csvtool/domain/cross_csv_dedupe_session/*
 lib/csvtool/domain/csv_parity_session/*
+lib/csvtool/domain/csv_split_session/*
 lib/csvtool/domain/shared/output_destination.rb
 lib/csvtool/application/use_cases/run_extraction.rb
 lib/csvtool/application/use_cases/run_row_extraction.rb
 lib/csvtool/application/use_cases/run_row_randomization.rb
 lib/csvtool/application/use_cases/run_cross_csv_dedupe.rb
 lib/csvtool/application/use_cases/run_csv_parity.rb
+lib/csvtool/application/use_cases/run_csv_split.rb
 lib/csvtool/infrastructure/csv/*
 lib/csvtool/infrastructure/output/*
 lib/csvtool/interface/cli/menu_loop.rb

data/docs/release-v0.7.0-alpha.md ADDED

@@ -0,0 +1,87 @@
+# Release Checklist: v0.7.0-alpha
+## 1. Verify environment
+```bash
+ruby -v
+bundle -v
+```
+Expected:
+- Ruby `3.3.x`
+## 2. Install dependencies
+```bash
+bundle install
+```
+## 3. Run quality checks
+```bash
+bundle exec rake test
+```
+## 4. Smoke test CLI commands
+```bash
+bundle exec csvtool menu
+bundle exec csvtool column test/fixtures/sample_people.csv name
+```
+## 5. Smoke test workflows
+### CSV split workflow (new in this release)
+Use menu option `6` (`Split CSV into chunks`) and verify:
+- happy path split (`N=10`) writes expected chunk files and counts
+- separator and header mode options work (CSV/TSV/headerless/custom)
+- output directory + file prefix options produce expected paths
+- overwrite protection blocks existing chunk paths unless allowed
+- optional manifest output writes valid CSV metadata
+### Existing workflows regression pass
+Use menu options `1-5` and verify:
+- column extraction still works
+- row-range extraction still works
+- row randomization still works
+- cross-CSV dedupe still works
+- parity validation still works
+## 6. Build and validate gem package
+```bash
+gem build csvops.gemspec
+gem install ./csvops-0.7.0.alpha.gem
+csvtool menu
+```
+## 7. Commit release prep
+```bash
+git add -A
+git commit -m "chore(release): prepare v0.7.0-alpha"
+```
+## 8. Tag release
+```bash
+git tag -a v0.7.0-alpha -m "v0.7.0-alpha"
+git push origin main --tags
+```
+## 9. Publish gem
+```bash
+gem push csvops-0.7.0.alpha.gem
+```
+## 10. Create GitHub release
+Create release `v0.7.0-alpha` with:
+- New `Split CSV into chunks` workflow
+- Split-domain architecture (workflow steps, builder, presenter, use case, infrastructure adapters)
+- Output strategy improvements (directory/prefix/overwrite controls)
+- Optional split manifest output
+- Large-file streaming split coverage and docs updates

data/lib/csvtool/application/use_cases/run_csv_split.rb ADDED

@@ -0,0 +1,97 @@
+# frozen_string_literal: true
+require "csv"
+require "fileutils"
+require "csvtool/infrastructure/csv/header_reader"
+require "csvtool/infrastructure/csv/csv_splitter"
+require "csvtool/infrastructure/output/csv_split_manifest_writer"
+module Csvtool
+  module Application
+    module UseCases
+      class RunCsvSplit
+        Result = Struct.new(:ok, :error, :data, keyword_init: true) do
+          def ok?
+            ok
+          end
+        end
+        def initialize(
+          header_reader: Infrastructure::CSV::HeaderReader.new,
+          csv_splitter: Infrastructure::CSV::CsvSplitter.new,
+          csv_split_manifest_writer: Infrastructure::Output::CsvSplitManifestWriter.new
+        )
+          @header_reader = header_reader
+          @csv_splitter = csv_splitter
+          @csv_split_manifest_writer = csv_split_manifest_writer
+        end
+        def read_headers(file_path:, col_sep:, headers_present:)
+          return failure(:file_not_found, path: file_path) unless File.file?(file_path)
+          return success(headers: nil) unless headers_present
+          headers = @header_reader.call(file_path: file_path, col_sep: col_sep)
+          return failure(:no_headers) if headers.empty?
+          success(headers: headers)
+        rescue CSV::MalformedCSVError
+          failure(:could_not_parse_csv)
+        rescue Errno::EACCES
+          failure(:cannot_read_file, path: file_path)
+        end
+        def call(session:)
+          source = session.source
+          output_directory = session.options.output_directory || File.dirname(source.path)
+          file_prefix = session.options.file_prefix || File.basename(source.path, ".*")
+          FileUtils.mkdir_p(output_directory)
+          stats = @csv_splitter.call(
+            file_path: source.path,
+            col_sep: source.separator,
+            headers_present: source.headers_present,
+            chunk_size: session.options.chunk_size,
+            output_directory: output_directory,
+            file_prefix: file_prefix,
+            overwrite_existing: session.options.overwrite_existing
+          )
+          manifest_path = maybe_write_manifest(
+            session: session,
+            output_directory: output_directory,
+            file_prefix: file_prefix,
+            stats: stats
+          )
+          success(stats.merge(output_directory: output_directory, file_prefix: file_prefix, manifest_path: manifest_path))
+        rescue Infrastructure::CSV::CsvSplitter::OutputFileExistsError => e
+          failure(:output_file_exists, path: e.path)
+        rescue CSV::MalformedCSVError
+          failure(:could_not_parse_csv)
+        rescue Errno::EACCES, Errno::ENOENT => e
+          failure(:cannot_write_output_file, path: output_directory, error_class: e.class)
+        end
+        private
+        def success(data)
+          Result.new(ok: true, error: nil, data: data)
+        end
+        def failure(code, data = {})
+          Result.new(ok: false, error: code, data: data)
+        end
+        def maybe_write_manifest(session:, output_directory:, file_prefix:, stats:)
+          return nil unless session.options.write_manifest
+          manifest_path = session.options.manifest_path || File.join(output_directory, "#{file_prefix}_manifest.csv")
+          @csv_split_manifest_writer.call(
+            path: manifest_path,
+            chunk_paths: stats[:chunk_paths],
+            chunk_row_counts: stats[:chunk_row_counts]
+          )
+          manifest_path
+        end
+      end
+    end
+  end
+end
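
`RunCsvSplit` reports both outcomes through the same `Result` struct, so a caller branches on `ok?` rather than rescuing exceptions. The sketch below copies that struct so it runs without the gem; the `describe` helper and the sample payloads are hypothetical and only illustrate the caller side:

```ruby
# Stand-alone copy of the Result shape used by RunCsvSplit.
Result = Struct.new(:ok, :error, :data, keyword_init: true) do
  def ok?
    ok
  end
end

# Hypothetical caller-side helper: turn a result into a one-line summary.
def describe(result)
  if result.ok?
    "Chunks written: #{result.data[:chunk_count]}"
  else
    "Failed (#{result.error}): #{result.data[:path]}"
  end
end

success = Result.new(ok: true, error: nil, data: { chunk_count: 25 })
failure = Result.new(ok: false, error: :output_file_exists, data: { path: "/tmp/people_part_001.csv" })

puts describe(success)
puts describe(failure)
```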

data/lib/csvtool/cli.rb CHANGED

@@ -7,6 +7,7 @@ require "csvtool/interface/cli/workflows/run_row_extraction_workflow"
 require "csvtool/interface/cli/workflows/run_row_randomization_workflow"
 require "csvtool/interface/cli/workflows/run_cross_csv_dedupe_workflow"
 require "csvtool/interface/cli/workflows/run_csv_parity_workflow"
+require "csvtool/interface/cli/workflows/run_csv_split_workflow"
 require "csvtool/interface/cli/errors/presenter"
 require "csvtool/infrastructure/csv/header_reader"
 require "csvtool/infrastructure/csv/value_streamer"
@@ -20,6 +21,7 @@ module Csvtool
       "Randomize rows",
       "Dedupe using another CSV",
       "Validate parity",
+      "Split CSV into chunks",
       "Exit"
     ].freeze
@@ -54,6 +56,7 @@ module Csvtool
       randomize_rows_action = -> { Interface::CLI::Workflows::RunRowRandomizationWorkflow.new(stdin: @stdin, stdout: @stdout).call }
       dedupe_action = -> { Interface::CLI::Workflows::RunCrossCsvDedupeWorkflow.new(stdin: @stdin, stdout: @stdout).call }
       parity_action = -> { Interface::CLI::Workflows::RunCsvParityWorkflow.new(stdin: @stdin, stdout: @stdout).call }
+      split_action = -> { Interface::CLI::Workflows::RunCsvSplitWorkflow.new(stdin: @stdin, stdout: @stdout).call }
       Interface::CLI::MenuLoop.new(
         stdin: @stdin,
         stdout: @stdout,
@@ -62,7 +65,8 @@ module Csvtool
         extract_rows_action: extract_rows_action,
         randomize_rows_action: randomize_rows_action,
         dedupe_action: dedupe_action,
-        parity_action: parity_action
+        parity_action: parity_action,
+        split_action: split_action
       ).run
     end

data/lib/csvtool/domain/csv_split_session/split_options.rb ADDED

@@ -0,0 +1,27 @@
+# frozen_string_literal: true
+module Csvtool
+  module Domain
+    module CsvSplitSession
+      class SplitOptions
+        attr_reader :chunk_size, :output_directory, :file_prefix, :overwrite_existing, :write_manifest, :manifest_path
+        def initialize(
+          chunk_size:,
+          output_directory: nil,
+          file_prefix: nil,
+          overwrite_existing: false,
+          write_manifest: false,
+          manifest_path: nil
+        )
+          @chunk_size = Integer(chunk_size)
+          @output_directory = output_directory
+          @file_prefix = file_prefix
+          @overwrite_existing = overwrite_existing
+          @write_manifest = write_manifest
+          @manifest_path = manifest_path
+        end
+      end
+    end
+  end
+end
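
`SplitOptions` coerces the chunk size with `Integer(chunk_size)`, which raises on non-numeric input but accepts zero and negative values. The error presenter's message ("Chunk size must be a positive integer.") suggests the positivity check lives elsewhere, presumably in workflow steps not shown in this part of the diff. A minimal sketch of such a check follows; `parse_chunk_size` is a hypothetical name, not a csvops method:

```ruby
# Hypothetical validator, not taken from csvops: returns a positive Integer,
# or nil when the input is not a positive whole number.
def parse_chunk_size(raw)
  # Base 10 is passed explicitly so a leading zero is not read as octal.
  value = Integer(raw.to_s.strip, 10)
  value.positive? ? value : nil
rescue ArgumentError
  nil
end

p parse_chunk_size("1000")
p parse_chunk_size("0")
p parse_chunk_size("-5")
p parse_chunk_size("abc")
```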

data/lib/csvtool/domain/csv_split_session/split_session.rb ADDED

@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+module Csvtool
+  module Domain
+    module CsvSplitSession
+      class SplitSession
+        attr_reader :source, :options
+        def self.start(source:, options:)
+          new(source: source, options: options)
+        end
+        def initialize(source:, options:)
+          @source = source
+          @options = options
+        end
+      end
+    end
+  end
+end

data/lib/csvtool/domain/csv_split_session/split_source.rb ADDED

@@ -0,0 +1,17 @@
+# frozen_string_literal: true
+module Csvtool
+  module Domain
+    module CsvSplitSession
+      class SplitSource
+        attr_reader :path, :separator, :headers_present
+        def initialize(path:, separator:, headers_present:)
+          @path = path
+          @separator = separator
+          @headers_present = headers_present
+        end
+      end
+    end
+  end
+end

data/lib/csvtool/infrastructure/csv/csv_splitter.rb ADDED

@@ -0,0 +1,64 @@
+# frozen_string_literal: true
+require "csv"
+module Csvtool
+  module Infrastructure
+    module CSV
+      class CsvSplitter
+        class OutputFileExistsError < StandardError
+          attr_reader :path
+          def initialize(path)
+            super("output file exists: #{path}")
+            @path = path
+          end
+        end
+        def call(file_path:, col_sep:, headers_present:, chunk_size:, output_directory:, file_prefix:, overwrite_existing:)
+          ext = File.extname(file_path)
+          ext = ".csv" if ext.empty?
+          sequence = 0
+          data_rows = 0
+          chunk_paths = []
+          chunk_row_counts = []
+          rows_in_chunk = 0
+          current_csv = nil
+          write_mode_headers = nil
+          write_headers = headers_present
+          ::CSV.foreach(file_path, headers: headers_present, col_sep: col_sep) do |row|
+            if current_csv.nil? || rows_in_chunk >= chunk_size
+              current_csv&.close
+              sequence += 1
+              rows_in_chunk = 0
+              path = File.join(output_directory, format("%<prefix>s_part_%<num>03d%<ext>s", prefix: file_prefix, num: sequence, ext: ext))
+              raise OutputFileExistsError.new(path) if File.exist?(path) && !overwrite_existing
+              chunk_paths << path
+              chunk_row_counts << 0
+              write_mode_headers = headers_present ? row.headers : nil
+              current_csv = ::CSV.open(path, "w", write_headers: write_headers, headers: write_mode_headers, col_sep: col_sep)
+            end
+            fields = headers_present ? row.fields : row
+            current_csv << fields
+            rows_in_chunk += 1
+            chunk_row_counts[-1] += 1
+            data_rows += 1
+          end
+          {
+            chunk_paths: chunk_paths,
+            chunk_count: chunk_paths.length,
+            data_rows: data_rows,
+            chunk_row_counts: chunk_row_counts
+          }
+        ensure
+          current_csv&.close unless current_csv&.closed?
+        end
+      end
+    end
+  end
+end
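
The streaming approach in `CsvSplitter` can be tried in isolation with the reduced sketch below. It is not the gem's class: it assumes a comma-separated source with a header row, always writes a `.csv` extension, and omits the overwrite check. The method name `split_csv` and the sample data are made up for illustration:

```ruby
require "csv"
require "tmpdir"

# Simplified stand-alone sketch: stream the source row by row and write at
# most chunk_size data rows per output file, repeating the header row at the
# top of every chunk. Returns the chunk paths in write order.
def split_csv(source_path, chunk_size:, output_directory:, prefix:)
  chunk_paths = []
  writer = nil
  rows_in_chunk = 0

  CSV.foreach(source_path, headers: true) do |row|
    if writer.nil? || rows_in_chunk >= chunk_size
      writer&.close
      name = format("%s_part_%03d.csv", prefix, chunk_paths.length + 1)
      path = File.join(output_directory, name)
      chunk_paths << path
      writer = CSV.open(path, "w")
      writer << row.headers
      rows_in_chunk = 0
    end
    writer << row.fields
    rows_in_chunk += 1
  end

  chunk_paths
ensure
  writer&.close
end

Dir.mktmpdir do |dir|
  source = File.join(dir, "people.csv")
  File.write(source, "name,city\nAda,London\nBob,Paris\nCy,Rome\nDee,Oslo\nEve,Lima\n")

  paths = split_csv(source, chunk_size: 2, output_directory: dir, prefix: "people")
  paths.each do |path|
    puts "#{File.basename(path)}: #{CSV.read(path, headers: true).length} data rows"
  end
end
```

With five data rows and a chunk size of 2, the sketch writes three files holding 2, 2, and 1 data rows. Only one output handle is open at a time, so memory use does not grow with the size of the source file.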

data/lib/csvtool/infrastructure/output/csv_split_manifest_writer.rb ADDED

@@ -0,0 +1,20 @@
+# frozen_string_literal: true
+require "csv"
+module Csvtool
+  module Infrastructure
+    module Output
+      class CsvSplitManifestWriter
+        def call(path:, chunk_paths:, chunk_row_counts:)
+          ::CSV.open(path, "w") do |csv|
+            csv << %w[chunk_index chunk_path row_count]
+            chunk_paths.each_with_index do |chunk_path, index|
+              csv << [index + 1, chunk_path, chunk_row_counts[index]]
+            end
+          end
+        end
+      end
+    end
+  end
+end
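
The manifest is a plain CSV with three columns (`chunk_index`, `chunk_path`, `row_count`), so it can be read back with the standard library. The sketch below writes one in the same layout and parses it again; the helper names `write_manifest` and `read_manifest` and the sample paths and counts are hypothetical:

```ruby
require "csv"
require "tmpdir"

# Writes a manifest in the same three-column layout as CsvSplitManifestWriter.
def write_manifest(path, chunk_paths, chunk_row_counts)
  CSV.open(path, "w") do |csv|
    csv << %w[chunk_index chunk_path row_count]
    chunk_paths.each_with_index do |chunk_path, index|
      csv << [index + 1, chunk_path, chunk_row_counts[index]]
    end
  end
end

# Reads the manifest back as an array of hashes, converting numeric fields
# to Integer and leaving the path column as a String.
def read_manifest(path)
  CSV.foreach(path, headers: true, converters: :integer).map(&:to_h)
end

Dir.mktmpdir do |dir|
  manifest_path = File.join(dir, "people_manifest.csv")
  write_manifest(manifest_path, ["/tmp/people_part_001.csv", "/tmp/people_part_002.csv"], [1000, 250])

  read_manifest(manifest_path).each do |entry|
    puts "chunk #{entry["chunk_index"]}: #{entry["row_count"]} rows in #{entry["chunk_path"]}"
  end
end
```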

data/lib/csvtool/interface/cli/errors/presenter.rb CHANGED

@@ -33,6 +33,10 @@ module Csvtool
             @stdout.puts "Cannot write output file: #{path} (#{error_class})"
           end
+          def output_file_exists(path)
+            @stdout.puts "Output file already exists: #{path}"
+          end
           def empty_output_path
             @stdout.puts "Output file path cannot be empty."
           end
@@ -53,6 +57,10 @@ module Csvtool
             @stdout.puts "Seed must be an integer."
           end
+          def invalid_chunk_size
+            @stdout.puts "Chunk size must be a positive integer."
+          end
           def canceled
             @stdout.puts "Canceled."
           end

data/lib/csvtool/interface/cli/menu_loop.rb CHANGED

@@ -4,7 +4,7 @@ module Csvtool
   module Interface
     module CLI
       class MenuLoop
-        def initialize(stdin:, stdout:, menu_options:, extract_column_action:, extract_rows_action:, randomize_rows_action:, dedupe_action:, parity_action:)
+        def initialize(stdin:, stdout:, menu_options:, extract_column_action:, extract_rows_action:, randomize_rows_action:, dedupe_action:, parity_action:, split_action:)
           @stdin = stdin
           @stdout = stdout
           @menu_options = menu_options
@@ -13,6 +13,7 @@ module Csvtool
           @randomize_rows_action = randomize_rows_action
           @dedupe_action = dedupe_action
           @parity_action = parity_action
+          @split_action = split_action
         end
         def run
@@ -34,9 +35,11 @@ module Csvtool
             when "5"
               @parity_action.call
             when "6"
+              @split_action.call
+            when "7"
               return 0
             else
-              @stdout.puts "Please choose 1, 2, 3, 4, 5, or 6."
+              @stdout.puts "Please choose 1, 2, 3, 4, 5, 6, or 7."
             end
           end
         end

data/lib/csvtool/interface/cli/prompts/chunk_size_prompt.rb ADDED

@@ -0,0 +1,21 @@
+# frozen_string_literal: true
+module Csvtool
+  module Interface
+    module CLI
+      module Prompts
+        class ChunkSizePrompt
+          def initialize(stdin:, stdout:)
+            @stdin = stdin
+            @stdout = stdout
+          end
+          def call
+            @stdout.print "Rows per chunk: "
+            @stdin.gets&.strip.to_s
+          end
+        end
+      end
+    end
+  end
+end
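
Because the prompt takes its input and output streams as constructor arguments, it can be exercised with `StringIO` objects instead of a terminal. The sketch below repeats the prompt class outside its `Csvtool` namespace so it runs without the gem; the sample inputs are made up:

```ruby
require "stringio"

# Stand-alone copy of the prompt body, defined at top level for the sketch.
class ChunkSizePrompt
  def initialize(stdin:, stdout:)
    @stdin = stdin
    @stdout = stdout
  end

  def call
    @stdout.print "Rows per chunk: "
    # gets returns nil at end of input; the chain then yields "".
    @stdin.gets&.strip.to_s
  end
end

out = StringIO.new
answer = ChunkSizePrompt.new(stdin: StringIO.new("1000\n"), stdout: out).call
at_eof = ChunkSizePrompt.new(stdin: StringIO.new(""), stdout: StringIO.new).call

p answer
p at_eof
p out.string
```

The prompt returns the raw string in both cases, which leaves numeric validation to whatever consumes the answer.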