RubyGems - csv-utils - Versions diffs - 0.3.25 → 0.5.0 - Mend

csv-utils 0.3.25 → 0.5.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (28)

checksums.yaml +4 -4
data/.github/workflows/ci.yml +53 -0
data/.rubocop.yml +81 -0
data/ARCHITECTURE.md +154 -0
data/CLAUDE.md +63 -0
data/Gemfile +2 -1
data/Gemfile.lock +5 -0
data/README.md +238 -16
data/bin/csv-diff +3 -3
data/bin/csv-duplicate-finder +1 -1
data/bin/csv-grep +3 -3
data/bin/csv-readline +4 -5
data/bin/csv-splitter +1 -1
data/bin/csv-validator +38 -36
data/csv-utils.gemspec +6 -5
data/lib/csv-utils.rb +3 -0
data/lib/csv_utils/csv_compare.rb +77 -71
data/lib/csv_utils/csv_extender.rb +45 -41
data/lib/csv_utils/csv_iterator.rb +90 -75
data/lib/csv_utils/csv_options.rb +11 -11
data/lib/csv_utils/csv_report.rb +5 -2
data/lib/csv_utils/csv_row.rb +3 -1
data/lib/csv_utils/csv_row_matcher.rb +34 -0
data/lib/csv_utils/csv_sort.rb +110 -96
data/lib/csv_utils/csv_transformer.rb +95 -92
data/lib/csv_utils/csv_wrapper.rb +40 -36
metadata +13 -6
data/docs/ARCHITECTURE.md +0 -134

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 9b30f6d77e44b6d4abca4ee199142b4efa6cb73895df87ba630b82f50b964fbc
-  data.tar.gz: 10337ebbb358c3e60f5f4d47756297577468b45bbaeaf4e1b0b355b12c9083a4
+  metadata.gz: 151a25f6d4d171b169ac194665f217151a96ef43aede166c6a62ec9f2b259765
+  data.tar.gz: b223b26ea97a29f58f532e6b737d481c4f144ce2e202592f8057300790156ed5
 SHA512:
-  metadata.gz: 84432b88f8fc0aee4422fe46a36bab6e99cca1d604469608e2e4ced97c1f748be66796bc7f0c161b660d31e6424f5b155b38bfb034e9cee234fd8c71459bb2d3
-  data.tar.gz: ad864bb734d1c9f7d6a99d97aa2567811c66f8bf317571a947d56196776c61abf229d3d3aad0f5485f1a39b64d257e5be234ffa19dcc9c8a975a045fffcd95b1
+  metadata.gz: 7afee881db16b4afce9cceb812d4994ee66bcf172db147eaf73911961983cd0d73f9c553ca549880793e1c9a2bb2904e8444f2618ff30d040808d1b4e5383cf7
+  data.tar.gz: 4811b6ff72153107bf029f36feef65ec68646e0365d3b3987bd3231b9842b20c8c27c72216ca25350bd469887315b139737eedeba3b7c7fa17d240c729ae9a5a

data/.github/workflows/ci.yml ADDED

@@ -0,0 +1,53 @@
+name: CI
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+jobs:
+  lint:
+    name: RuboCop
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Ruby
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: '3.4'
+          bundler-cache: true
+      - name: Run RuboCop
+        run: bundle exec rubocop --format github
+  test:
+    name: Tests (Ruby ${{ matrix.ruby }})
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        ruby: ['3.2', '3.3', '3.4']
+    steps:
+      - uses: actions/checkout@v4
+      - name: Set up Ruby ${{ matrix.ruby }}
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: ${{ matrix.ruby }}
+          bundler-cache: true
+      - name: Run tests
+        run: bundle exec rspec
+      - name: Upload coverage to Codecov
+        if: matrix.ruby == '3.4'
+        uses: codecov/codecov-action@v4
+        with:
+          files: coverage/coverage.xml
+          fail_ci_if_error: false
+          verbose: true
+        env:
+          CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}

data/.rubocop.yml ADDED

@@ -0,0 +1,81 @@
+AllCops:
+  TargetRubyVersion: 3.2
+  NewCops: enable
+  SuggestExtensions: false
+  Exclude:
+    - 'bin/**/*'
+    - 'vendor/**/*'
+    - 'coverage/**/*'
+# Relaxed metrics for existing codebase
+Metrics/MethodLength:
+  Max: 35
+Metrics/AbcSize:
+  Max: 40
+Metrics/ClassLength:
+  Max: 200
+Metrics/CyclomaticComplexity:
+  Max: 15
+Metrics/PerceivedComplexity:
+  Max: 15
+Metrics/BlockLength:
+  Exclude:
+    - 'spec/**/*'
+    - '*.gemspec'
+# Style preferences
+Style/Documentation:
+  Enabled: false
+Style/FrozenStringLiteralComment:
+  EnforcedStyle: always
+Layout/LineLength:
+  Max: 130
+  Exclude:
+    - 'spec/**/*'
+# File naming - allow hyphenated gem name
+Naming/FileName:
+  Exclude:
+    - 'lib/csv-utils.rb'
+# Allow set_ prefix for transformer DSL methods
+Naming/AccessorMethodName:
+  Exclude:
+    - 'lib/csv_utils/csv_transformer.rb'
+# Allow Kernel#open for pipe support (intentional design)
+Security/Open:
+  Exclude:
+    - 'lib/csv_utils/csv_wrapper.rb'
+# Duplicate branches are intentional in comparison logic
+Lint/DuplicateBranch:
+  Exclude:
+    - 'lib/csv_utils/csv_compare.rb'
+    - 'lib/csv_utils/csv_sort.rb'
+# Allow empty blocks in specs (used for testing iteration behavior)
+Lint/EmptyBlock:
+  Exclude:
+    - 'spec/**/*'
+# Style relaxations for existing code patterns
+Style/OptionalBooleanParameter:
+  Enabled: false
+Style/StringConcatenation:
+  Enabled: false
+Style/FormatStringToken:
+  Enabled: false
+# Gemspec settings
+Gemspec/RequiredRubyVersion:
+  Enabled: false

data/ARCHITECTURE.md ADDED

@@ -0,0 +1,154 @@
+# Architecture
+This document describes the internal architecture of the csv-utils gem.
+## Overview
+csv-utils is a Ruby gem providing utilities for manipulating, debugging, and processing CSV files. The library emphasizes handling malformed CSVs and large file processing through streaming and batch operations.
+## Core Design Principles
+1. **Streaming Over Loading** - Files are processed row-by-row rather than loading entire files into memory
+2. **Resource Management** - CSVWrapper ensures proper file handle lifecycle management
+3. **Batch Processing** - Large operations support configurable batch sizes to balance memory and performance
+4. **BOM Handling** - All readers strip UTF-8/16/32 byte order marks from headers
+## Component Architecture
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                        CSVUtils Module                          │
+├─────────────────────────────────────────────────────────────────┤
+│  Detection Layer                                                │
+│  ┌─────────────┐                                                │
+│  │ CSVOptions  │  Auto-detects separators, encoding, BOM       │
+│  └─────────────┘                                                │
+├─────────────────────────────────────────────────────────────────┤
+│  I/O Layer                                                      │
+│  ┌─────────────┐  ┌──────────────┐                              │
+│  │ CSVWrapper  │  │ CSVIterator  │  Enumerable, RowWrapper     │
+│  └─────────────┘  └──────────────┘                              │
+├─────────────────────────────────────────────────────────────────┤
+│  Processing Layer                                               │
+│  ┌───────────────┐  ┌─────────────┐  ┌─────────────┐           │
+│  │ CSVTransformer│  │ CSVExtender │  │  CSVSort    │           │
+│  │ (pipeline)    │  │ (append)    │  │ (merge sort)│           │
+│  └───────────────┘  └─────────────┘  └─────────────┘           │
+├─────────────────────────────────────────────────────────────────┤
+│  Analysis Layer                                                 │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐             │
+│  │ CSVCompare  │  │  CSVReport  │  │   CSVRow    │             │
+│  │ (diff)      │  │  (generate) │  │  (mixin)    │             │
+│  └─────────────┘  └─────────────┘  └─────────────┘             │
+└─────────────────────────────────────────────────────────────────┘
+```
+## Key Components
+### CSVOptions (Detection)
+Auto-detects CSV file properties by reading the first line:
+- **Column separators**: `\x02`, `\t`, `|`, `,` (checked in order)
+- **Row separators**: `\r\n`, `\n`, `\r`
+- **Byte order marks**: UTF-8, UTF-16, UTF-32
+- **Encoding**: Derived from BOM or defaults to UTF-8
+### CSVWrapper (I/O)
+Resource-safe wrapper around Ruby's CSV class:
+- Tracks whether it opened the file (vs receiving an existing handle)
+- Only closes files it opened (`@close_when_done`)
+- Provides uniform interface for both file paths and CSV objects
+### CSVIterator (I/O)
+Enumerable wrapper for CSV reading:
+- **RowWrapper**: Hash subclass that preserves line numbers for error reporting
+- `each_batch(size)`: Yields rows in configurable batches
+- `to_hash(key, value)`: Builds lookup hash from CSV columns
+- Tracks `prev_row` for error context
+### CSVSort (Processing)
+External merge sort for large files:
+1. **Chunking**: Reads file in batches (default 100,000 rows)
+2. **Sort chunks**: Each batch sorted in memory, written to `.part.N` temp files
+3. **Merge**: Temp files merged pairwise into `.merge.N` files until one remains
+4. **Cleanup**: Temp files deleted, final file moved to destination
+### CSVTransformer (Processing)
+Chainable transformation pipeline:
+- `select(&block)` / `reject(&block)` - Filter rows
+- `map(new_headers, &block)` - Transform rows
+- `append(headers, &block)` - Add columns
+- `additional_data(&block)` - Compute batch-level data accessible to other steps
+- `each(&block)` - Side effects without modification
+- Processes in batches (default 10,000 rows)
+### CSVExtender (Processing)
+Appends columns to existing CSV:
+- `append(headers)` - Row-by-row column addition
+- `append_in_batches(headers, size)` - Batch processing for external lookups
+### CSVCompare (Analysis)
+Compares two **pre-sorted** CSV files:
+- Yields `:create`, `:update`, `:delete` actions
+- Requires a comparison proc for row identity
+- Optional `update_comparison_columns` to detect changes (e.g., `updated_at`)
+- Both files must be sorted by the same key columns
+### CSVReport (Analysis)
+Builds CSV output from objects:
+- Accepts file path or existing CSV object
+- Block-based generation with automatic close
+- Works with CSVRow-enabled objects
+### CSVRow (Mixin)
+Module for defining CSV-serializable objects:
+- `csv_column(name, options, &block)` - Define columns declaratively
+- Uses `inheritance-helper` for inherited column definitions
+- Columns can reference methods or use custom procs
+## CLI Tools
+Standalone executables for CSV debugging:
+| Tool | Purpose |
+|------|---------|
+| `csv-find-error` | Locates malformed CSV errors, shows context |
+| `csv-readline` | Reads specific line numbers |
+| `csv-validator` | Validates CSV structure |
+| `csv-diff` | Compares two CSV files |
+| `csv-grep` | Searches within CSV content |
+| `csv-splitter` | Splits large files into parts |
+| `csv-explorer` | Interactive CSV exploration |
+| `csv-duplicate-finder` | Identifies duplicate rows |
+| `csv-change-eol` | Converts line endings |
+## Data Flow Patterns
+### Comparison Flow (requires pre-sorting)
+```
+primary.csv ──┐
+              ├── CSVCompare ──> :create/:update/:delete actions
+secondary.csv─┘
+```
+### Transformation Flow
+```
+input.csv ──> CSVTransformer ──[select]──[map]──[append]──> output.csv
+```
+### Sort Flow (external merge sort)
+```
+large.csv ──> [chunk & sort] ──> .part.0, .part.1, ...
+                                      │
+                    [pairwise merge] ──┘
+                           │
+                    sorted.csv
+```
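
Reviewer note: the four CSVSort steps listed in the added ARCHITECTURE.md (chunk, sort chunks, pairwise merge, cleanup) can be sketched roughly as below. This is a minimal illustration using only Ruby's standard library, not the gem's implementation. The names `external_sort_csv` and `merge_sorted_files` are hypothetical, the input is assumed to have no header row, and temp files live in a throwaway directory rather than next to the source file.

```ruby
require 'csv'
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: merges two already-sorted CSV files into out_path.
def merge_sorted_files(path_a, path_b, out_path, &cmp)
  a = CSV.open(path_a)
  b = CSV.open(path_b)
  CSV.open(out_path, 'w') do |out|
    row_a = a.shift
    row_b = b.shift
    # Repeatedly emit the smaller of the two head rows.
    while row_a && row_b
      if cmp.call(row_a, row_b) <= 0
        out << row_a
        row_a = a.shift
      else
        out << row_b
        row_b = b.shift
      end
    end
    # Drain whichever side still has rows.
    while row_a
      out << row_a
      row_a = a.shift
    end
    while row_b
      out << row_b
      row_b = b.shift
    end
  end
ensure
  a&.close
  b&.close
end

# Hypothetical helper, not the gem's API: external merge sort of a
# headerless CSV file. Rows are arrays of strings.
def external_sort_csv(src, dest, batch_size: 100_000, &cmp)
  cmp ||= ->(x, y) { x <=> y }
  parts = []
  Dir.mktmpdir('csv-sort') do |dir|
    # Steps 1 and 2: read in batches, sort each batch in memory, write a part file.
    CSV.foreach(src).each_slice(batch_size).with_index do |batch, idx|
      path = File.join(dir, "part.#{idx}")
      CSV.open(path, 'w') { |csv| batch.sort(&cmp).each { |row| csv << row } }
      parts << path
    end

    # Step 3: merge files two at a time until a single file remains.
    merge_id = 0
    while parts.size > 1
      first, second = parts.shift(2)
      merged = File.join(dir, "merge.#{merge_id}")
      merge_id += 1
      merge_sorted_files(first, second, merged, &cmp)
      File.delete(first, second)
      parts << merged
    end

    # Step 4: move the final file into place; mktmpdir removes the directory.
    if parts.empty?
      File.write(dest, '')
    else
      FileUtils.mv(parts.first, dest)
    end
  end
end
```

Only one batch is held in memory during chunking and only two rows during merging, which is the property the "Streaming Over Loading" principle in the file is after.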

data/CLAUDE.md ADDED

@@ -0,0 +1,63 @@
+# CLAUDE.md
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+## Build and Test Commands
+```bash
+bundle install          # Install dependencies
+bundle exec rspec       # Run all tests
+bundle exec rspec spec/csv_utils/csv_compare_spec.rb  # Run single test file
+bundle exec rubocop     # Run linter
+```
+## Architecture
+This is a Ruby gem (`csv-utils`) providing utilities for manipulating and debugging CSV files, particularly malformed ones.
+### Core Classes (lib/csv_utils/)
+- **CSVOptions** - Auto-detects CSV file properties: column separator, row separator, byte order marks, and encoding. Handles various separators (`\x02`, `\t`, `|`, `,`) and BOMs (UTF-8, UTF-16, UTF-32).
+- **CSVWrapper** - Resource-safe wrapper around Ruby's CSV class that manages file handle lifecycle.
+- **CSVCompare** - Compares two sorted CSV files, yielding `:create`, `:update`, or `:delete` actions.
+- **CSVSort** - Sorts CSV files by specified columns.
+- **CSVTransformer** - Applies row transformations with block-based processing.
+- **CSVExtender** - Extends CSV files with additional columns/data.
+- **CSVReport** - Generates reports from CSV data.
+- **CSVIterator** - Efficient CSV iteration.
+- **CSVRow** - Row-level operations.
+### CLI Tools (bin/)
+Standalone utilities for CSV debugging and manipulation:
+- `csv-find-error` - Locates malformed CSV errors
+- `csv-readline` - Reads specific lines from CSV
+- `csv-validator` - Validates CSV structure
+- `csv-diff` - Compares CSV files
+- `csv-grep` - Searches within CSV files
+- `csv-splitter` - Splits large CSV files
+- `csv-explorer` - Interactive CSV exploration
+- `csv-duplicate-finder` - Finds duplicate rows
+- `csv-change-eol` - Converts line endings
+### Dependencies
+- `csv` - Ruby's standard CSV library
+- `inheritance-helper` - Class inheritance utilities
+## Code Commits
+Format using angular formatting:
+```
+<type>(<scope>): <short summary>
+```
+- **type**: build|ci|docs|feat|fix|perf|refactor|test
+- **scope**: The feature or component of the service we're working on
+- **summary**: Summary in present tense. Not capitalized. No period at the end.
+## Documentation Maintenance
+When modifying the codebase, keep documentation in sync:
+- **ARCHITECTURE.md** - Update when adding/removing classes, changing component relationships, or altering data flow patterns
+- **README.md** - Update when adding new features, changing public APIs, or modifying usage examples
+- **Code comments** - Update inline documentation when changing method signatures or behavior
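
Reviewer note: the added docs describe CSVCompare as walking two pre-sorted files and yielding `:create`, `:update`, or `:delete`. A minimal sketch of that kind of sorted-merge comparison follows. It is not the gem's code: `compare_sorted` is a hypothetical name, rows are plain hashes in arrays rather than streamed from files, and the mapping of "only in source" to `:create` and "only in destination" to `:delete` is an assumption the diff does not confirm.

```ruby
# Hypothetical helper, not the gem's CSVCompare API.
# src_rows and dest_rows must already be sorted by the same key;
# identity is a block returning -1, 0, or 1 for a (src, dest) pair.
def compare_sorted(src_rows, dest_rows, update_columns, &identity)
  actions = []
  i = 0
  j = 0
  while i < src_rows.size || j < dest_rows.size
    src = src_rows[i]
    dest = dest_rows[j]
    # An exhausted side always sorts after the other side's remaining rows.
    order = if src.nil?
              1
            elsif dest.nil?
              -1
            else
              identity.call(src, dest)
            end

    if order.zero?
      # Same identity: report an update only if a watched column differs.
      changed = update_columns.any? { |col| src[col] != dest[col] }
      actions << [:update, src] if changed
      i += 1
      j += 1
    elsif order.negative?
      # Assumption: a row present only in the source is a :create.
      actions << [:create, src]
      i += 1
    else
      # Assumption: a row present only in the destination is a :delete.
      actions << [:delete, dest]
      j += 1
    end
  end
  actions
end

primary = [
  { 'id' => '1', 'updated_at' => 'a' },
  { 'id' => '2', 'updated_at' => 'b' },
  { 'id' => '4', 'updated_at' => 'd' }
]
secondary = [
  { 'id' => '1', 'updated_at' => 'a' },
  { 'id' => '2', 'updated_at' => 'x' },
  { 'id' => '3', 'updated_at' => 'c' }
]
result = compare_sorted(primary, secondary, ['updated_at']) do |s, d|
  s['id'].to_i <=> d['id'].to_i
end
result.map(&:first) # => [:update, :delete, :create]
```

The walk only ever advances forward through each list, which is why both inputs have to be sorted by the same key: an out-of-order row would be reported as a spurious create or delete.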

data/Gemfile CHANGED

@@ -7,7 +7,8 @@ gem 'inheritance-helper'
 group :development do
   gem 'rake'
-  gem 'rubocop'
   gem 'rspec'
+  gem 'rubocop'
   gem 'simplecov'
+  gem 'simplecov-cobertura'
 end

data/Gemfile.lock CHANGED

@@ -18,6 +18,7 @@ GEM
     rainbow (3.1.1)
     rake (13.2.1)
     regexp_parser (2.10.0)
+    rexml (3.4.4)
     rspec (3.13.0)
       rspec-core (~> 3.13.0)
       rspec-expectations (~> 3.13.0)
@@ -50,6 +51,9 @@ GEM
       docile (~> 1.1)
       simplecov-html (~> 0.11)
       simplecov_json_formatter (~> 0.1)
+    simplecov-cobertura (3.1.0)
+      rexml
+      simplecov (~> 0.19)
     simplecov-html (0.13.1)
     simplecov_json_formatter (0.1.4)
     unicode-display_width (3.1.4)
@@ -67,6 +71,7 @@ DEPENDENCIES
   rspec
   rubocop
   simplecov
+  simplecov-cobertura
 BUNDLED WITH
    2.6.2

data/README.md CHANGED

@@ -1,16 +1,22 @@
 # CSV Utils
+[![CI](https://github.com/dougyouch/csv-utils/actions/workflows/ci.yml/badge.svg)](https://github.com/dougyouch/csv-utils/actions/workflows/ci.yml)
+[![codecov](https://codecov.io/gh/dougyouch/csv-utils/graph/badge.svg)](https://codecov.io/gh/dougyouch/csv-utils)
 A Ruby library providing a comprehensive set of utilities for manipulating and processing CSV files. This library offers a robust set of tools for comparing, transforming, sorting, and managing CSV data efficiently.
 ## Features
 - **CSV Comparison**: Compare two CSV files and identify differences (creates, updates, and deletes)
-- **CSV Transformation**: Transform CSV data with customizable rules
-- **CSV Sorting**: Sort CSV files based on specified columns
-- **CSV Reporting**: Generate reports from CSV data
-- **CSV Iteration**: Efficient iteration over CSV files
-- **CSV Extension**: Extend CSV files with additional data
-- **CSV Wrapper**: Convenient wrapper for CSV operations
+- **CSV Transformation**: Transform CSV data with a chainable pipeline
+- **CSV Sorting**: Sort large CSV files using external merge sort
+- **CSV Reporting**: Generate CSV reports from Ruby objects
+- **CSV Row**: Mixin for defining CSV-serializable objects
+- **CSV Row Matcher**: Filter CSV rows using regex patterns across columns
+- **CSV Iteration**: Efficient iteration over CSV files with batch support
+- **CSV Extension**: Extend CSV files with additional columns
+- **CSV Options**: Auto-detect CSV file properties (separators, encoding, BOM)
+- **CLI Tools**: Command-line utilities for CSV debugging and manipulation
 ## Installation
@@ -36,8 +42,10 @@ $ gem install csv-utils
 ### Comparing CSV Files
+Compare two sorted CSV files to identify creates, updates, and deletes:
 ```ruby
-require 'csv_utils'
+require 'csv-utils'
 comparator = CSVUtils::CSVCompare.new('primary.csv', ['updated_at']) do |src, dest|
   src['id'] <=> dest['id']
@@ -55,31 +63,245 @@ comparator.compare('secondary.csv') do |action, record|
 end
 ```
+**Note**: Both CSV files must be sorted by the same key columns for comparison to work correctly.
 ### Sorting CSV Files
+Sort large CSV files using external merge sort:
 ```ruby
-sorter = CSVUtils::CSVSort.new('input.csv')
-sorter.sort('output.csv', ['id', 'name'])
+require 'csv-utils'
+sorter = CSVUtils::CSVSort.new('input.csv', 'output.csv', true)  # true = has headers
+sorter.sort(100_000) { |a, b| a.first.to_i <=> b.first.to_i }    # batch size, comparison block
 ```
 ### Transforming CSV Data
+Transform CSV data using a chainable pipeline:
+```ruby
+require 'csv-utils'
+CSVUtils::CSVTransformer.new('input.csv', 'output.csv')
+  .read_headers
+  .select { |row, headers, _| row[0].to_i > 100 }                    # filter rows
+  .map(['ID', 'Name']) { |row, headers, _| [row[0], row[1].upcase] } # transform rows
+  .append(['Email']) { |row, headers, _| ["#{row[1].downcase}@example.com"] }
+  .process(10_000)  # batch size
+```
+Available pipeline methods:
+- `select { |row, headers, additional_data| }` - Keep rows where block returns true
+- `reject { |row, headers, additional_data| }` - Remove rows where block returns true
+- `map(new_headers) { |row, headers, additional_data| }` - Transform rows
+- `append(additional_headers) { |row, headers, additional_data| }` - Add columns
+- `additional_data { |batch, headers| }` - Compute batch-level data for use in other steps
+- `each { |row, headers, additional_data| }` - Side effects without modification
+- `set_headers(headers)` - Override output headers
+### CSV Row and Report
+Define CSV-serializable objects and generate reports:
 ```ruby
-transformer = CSVUtils::CSVTransformer.new('input.csv')
-transformer.transform('output.csv') do |row|
-  # Transform row data
-  row['new_column'] = row['old_column'].upcase
-  row
+require 'csv-utils'
+class User
+  include CSVUtils::CSVRow
+  attr_accessor :id, :name, :email
+  csv_column :id, header: 'ID'
+  csv_column :name
+  csv_column(:email) { email.downcase }
+  def initialize(id, name, email)
+    @id = id
+    @name = name
+    @email = email
+  end
+end
+users = [
+  User.new(1, 'Alice', 'ALICE@example.com'),
+  User.new(2, 'Bob', 'BOB@example.com')
+]
+# Generate CSV report
+CSVUtils::CSVReport.new('users.csv', User) do |report|
+  users.each { |user| report << user }
 end
 ```
+The `csv_column` method accepts:
+- A symbol referencing a method: `csv_column :name`
+- A custom header: `csv_column :id, header: 'ID'`
+- A block for computed values: `csv_column(:email) { email.downcase }`
+- A proc: `csv_column :count, proc: Proc.new { data[:count] }`
+#### Generating Reports from ActiveRecord Models
+A powerful pattern is to subclass an ActiveRecord model with `CSVRow` for generating reports directly from database records:
+```ruby
+require 'csv-utils'
+class UserCSVRow < User
+  include CSVUtils::CSVRow
+  csv_column :id
+  csv_column :name
+  csv_column :email
+  csv_column :num_orders      # computed column
+  csv_column :total_revenue   # computed column
+  def num_orders
+    orders.count
+  end
+  def total_revenue
+    orders.sum(:amount)
+  end
+  # free up memory during large iterations
+  def clear!
+    @association_cache = {}
+  end
+end
+# Generate report using ActiveRecord scopes
+CSVUtils::CSVReport.new('user_report.csv', UserCSVRow) do |report|
+  UserCSVRow.where(active: true).find_each do |user|
+    report << user
+    user.clear!
+  end
+end
+```
+This pattern provides:
+- **Inherited attributes**: All model columns available without redefinition
+- **Association access**: Query related tables for computed columns
+- **ActiveRecord scopes**: Use `.where`, `.includes`, `.find_each` directly
+- **Memory efficiency**: The `clear!` method frees association cache during iteration
+### Iterating CSV Files
+Efficiently iterate over CSV files:
+```ruby
+require 'csv-utils'
+iterator = CSVUtils::CSVIterator.new('data.csv')
+# Iterate row by row
+iterator.each do |row|
+  puts "Line #{row.lineno}: #{row['name']}"
+end
+# Process in batches
+iterator.each_batch(1_000) do |batch|
+  # Process batch of rows
+end
+# Build a lookup hash
+lookup = iterator.to_hash('id', 'name')  # { 'id_value' => 'name_value', ... }
+```
+### Matching CSV Rows
+Filter CSV rows using regex patterns:
+```ruby
+require 'csv-utils'
+# Match against all columns
+matcher = CSVUtils::CSVRowMatcher.new(/error/i)
+# Or match only specific columns
+matcher = CSVUtils::CSVRowMatcher.new(/error/i, ['status', 'message'])
+# Use with iteration
+iterator = CSVUtils::CSVIterator.new('logs.csv')
+error_rows = iterator.select(&matcher)
+# Use directly
+row = { 'id' => '123', 'status' => 'Error', 'message' => 'Connection failed' }
+matcher.match?(row)  # => true
+```
+The matcher can be used with any Enumerable method via `to_proc`:
+```ruby
+rows.select(&matcher)  # rows matching the pattern
+rows.reject(&matcher)  # rows not matching the pattern
+rows.find(&matcher)    # first matching row
+```
+### Extending CSV Files
+Add columns to an existing CSV:
+```ruby
+require 'csv-utils'
+extender = CSVUtils::CSVExtender.new('input.csv', 'output.csv')
+# Row by row
+extender.append(['new_column']) do |row, headers|
+  [row[0].upcase]  # return array of new column values
+end
+# Or in batches (useful for external lookups)
+extender.append_in_batches(['status'], 1_000) do |batch, headers|
+  # Return array of arrays, one per row in batch
+  batch.map { |row| ['active'] }
+end
+```
+### Auto-detecting CSV Options
+Detect CSV file properties automatically:
+```ruby
+require 'csv-utils'
+options = CSVUtils::CSVOptions.new('data.csv')
+options.valid?         # true if separators detected
+options.col_separator  # detected column separator
+options.row_separator  # detected row separator
+options.encoding       # detected encoding (UTF-8, UTF-16, UTF-32)
+options.columns        # number of columns
+options.byte_order_mark # BOM if present
+```
+Supported column separators: `\x02`, `\t`, `|`, `,`
+Supported row separators: `\r\n`, `\n`, `\r`
+## CLI Tools
+The gem includes command-line utilities for CSV debugging:
+| Command | Description |
+|---------|-------------|
+| `csv-find-error` | Locate malformed CSV errors with context |
+| `csv-readline` | Read specific lines from a CSV file |
+| `csv-validator` | Validate CSV structure |
+| `csv-diff` | Compare two CSV files |
+| `csv-grep` | Search within CSV content |
+| `csv-splitter` | Split large CSV files into parts |
+| `csv-explorer` | Interactive CSV exploration |
+| `csv-duplicate-finder` | Find duplicate rows |
+| `csv-change-eol` | Convert line endings |
 ## Development
-After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests.
+After checking out the repo, run `bundle install` to install dependencies. Then, run `bundle exec rspec` to run the tests.
 ## Contributing
-Bug reports and pull requests are welcome on GitHub at https://github.com/yourusername/csv-utils.
+Bug reports and pull requests are welcome on GitHub at https://github.com/dougyouch/csv-utils.
 ## License