smarter_csv 1.12.0.pre1 → 1.12.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a7573b21c853accca5035c8ba3b1db8f0d3a7fdddc961a535440adedcf0b6b82
- data.tar.gz: 19fffb74289999f01210ad359ac349286a8b267df3910846bec81934f6cde333
+ metadata.gz: e37441fcb5fcb55c507df960d4472d085b6b8ab207596e0c723b1c7ed868bb90
+ data.tar.gz: ec554fd545805f48838000446af1749b2adaa4c8e3fb31b3ca146aa3d9b91fad
  SHA512:
- metadata.gz: 1f9bcb549185941fec0ee7a238df470a8bfdba7cc7ec007057afed0f9dfda8e7a298d1fcfe3bcb2911337827900ccb71df6bd65ed917aed322a3499d4cf3c3a9
- data.tar.gz: 3350a2d318e351f5d5a192fa1aa0664ed0ba42910c5742c18e4a92cdcd145f0e27ddea928e968e834b50e9ea2906b4b4aa573939540e661b65e11acffa739c0b
+ metadata.gz: 55842abeea7fa20b4811c8d1021a054829abe0dcd9e808e669ebcf8b17457979c66e7bf8110e0a5a07f224e2ca6371b98b1929b678c488f9499a845e733efb17
+ data.tar.gz: 8945d14497a08fef63b7908b10a9a8d483864b065c3b0fdd26497e6826733196fc76fa69c03008d48dbf233d382e552d6d3a56b999536278ebf33f48a5eb0c03
data/CHANGELOG.md CHANGED
@@ -1,10 +1,10 @@
 
  # SmarterCSV 1.x Change Log
 
- ## 1.12.0 (2024-07-08)
- * added SmarterCSV::Reader to process CSV files ([issue #277](https://github.com/tilo/smarter_csv/pull/277))
+ ## 1.12.0 (2024-07-09)
+ * Added Thread-Safety: added SmarterCSV::Reader to process CSV files in a thread-safe manner ([issue #277](https://github.com/tilo/smarter_csv/pull/277))
 
  * SmarterCSV::Writer changed default row separator to the system's row separator (`\n` on Linux, `\r\n` on Windows)
- * added a lot of docs
+ * added a doc tree
 
  * POTENTIAL ISSUE:
 
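For context on the thread-safety entry above, here is a minimal sketch of how the new instance-based reader is meant to be used, assuming the `SmarterCSV::Reader.new(file, options)` / `#process` API introduced by PR #277 (file names are illustrative):

```ruby
require 'smarter_csv'

# One Reader instance per thread: parsing state lives in the instance,
# so concurrent imports no longer share global state (assumed 1.12.0 API).
threads = ['posts.csv', 'users.csv'].map do |file|
  Thread.new do
    reader = SmarterCSV::Reader.new(file, chunk_size: 100)
    reader.process do |chunk|
      # chunk is an Array of Hashes with up to :chunk_size entries
      puts "#{file}: processed #{chunk.size} rows"
    end
  end
end
threads.each(&:join)
```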
data/README.md CHANGED
@@ -36,6 +36,7 @@ Or install it yourself as:
 
  * [Introduction](docs/_introduction.md)
  * [The Basic API](docs/basic_api.md)
+ * [Batch Processing](./docs/batch_processing.md)
  * [Configuration Options](docs/options.md)
  * [Row and Column Separators](docs/row_col_sep.md)
  * [Header Transformations](docs/header_transformations.md)
@@ -43,9 +44,8 @@ Or install it yourself as:
  * [Data Transformations](docs/data_transformations.md)
  * [Value Converters](docs/value_converters.md)
 
- * [Notes](docs/notes.md) <--- this info needs to be moved to individual pages
-
  # Articles
+ * [Parsing CSV Files in Ruby with SmarterCSV](https://tilo-sloboda.medium.com/parsing-csv-files-in-ruby-with-smartercsv-6ce66fb6cf38)
  * [Processing 1.4 Million CSV Records in Ruby, fast](https://lcx.wien/blog/processing-14-million-csv-records-in-ruby/)
  * [Speeding up CSV parsing with parallel processing](http://xjlin0.github.io/tech/2015/05/25/faster-parsing-csv-with-parallel-processing)
  * [The original post](http://www.unixgods.org/Ruby/process_csv_as_hashes.html) that started SmarterCSV
data/docs/_introduction.md CHANGED
@@ -1,8 +1,21 @@
 
- # SmarterCSV Introduction
+ ### Contents
+
+ * [**Introduction**](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
 
- `smarter_csv` is a Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, parallel processing, kicking-off batch jobs with Sidekiq, or oploading data to S3.
+ # SmarterCSV Introduction
 
+ `smarter_csv` is a Ruby Gem for convenient reading and writing of CSV files. It has intelligent defaults and auto-discovery of column and row separators. It imports CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking-off batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly, writing CSV files takes Hashes or Arrays of Hashes to create a CSV file.
 
  ## Why another CSV library?
 
@@ -38,3 +51,6 @@ The CSV processing also needed to be robust against variations in the input data
 
  * Data Validations
  (planned feature)
+
+ ---------------
+ PREVIOUS: [README](../README.md) | NEXT: [The Basic API](./basic_api.md)
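To make the "Array(s) of Hashes" description above concrete, a minimal sketch of the simplified reading API (the file contents shown in the comment are illustrative):

```ruby
require 'smarter_csv'

# pets.csv (illustrative):
#   name,type
#   Garfield,cat
#   Odie,dog
data = SmarterCSV.process('pets.csv')
# => [{:name=>"Garfield", :type=>"cat"}, {:name=>"Odie", :type=>"dog"}]

# Each Hash maps directly onto an ActiveRecord model, e.g. Pet.create!(row)
data.each { |row| puts row.inspect }
```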
data/docs/basic_api.md CHANGED
@@ -1,5 +1,19 @@
 
- # SmarterCSV API
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [**The Basic API**](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
+ # SmarterCSV Basic API
 
  Let's explore the basic APIs for reading and writing CSV files. There is a simplified API (backwards compatible with previous SmarterCSV versions) and the full API, which allows you to access the internal state of the reader or writer instance after processing.
 
@@ -138,3 +152,6 @@ $ hexdump -C spec/fixtures/bom_test_feff.csv
  data = SmarterCSV.process(f)
  end
  ```
+
+ ----------------
+ PREVIOUS: [Introduction](./_introduction.md) | NEXT: [Batch Processing](./batch_processing.md)
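To complement the reading examples above, a hedged sketch of the writing side; it assumes the `SmarterCSV.generate` block form with the `<<` appender, and the explicit `row_sep` is only needed if you do not want the new system-default row separator:

```ruby
require 'smarter_csv'

# Headers are derived from the Hash keys; `<<` accepts a Hash or an Array of Hashes.
SmarterCSV.generate('pets.csv', row_sep: "\n") do |writer|
  writer << { name: 'Garfield', type: 'cat' }
  writer << [{ name: 'Odie', type: 'dog' }, { name: 'Nermal', type: 'cat' }]
end

# Read it back with the simplified, backwards-compatible API:
data = SmarterCSV.process('pets.csv')
```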
data/docs/batch_processing.md CHANGED
@@ -1,4 +1,18 @@
 
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [**Batch Processing**](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Batch Processing
 
  Processing CSV data in batches (chunks) allows you to parallelize the workload of importing data.
@@ -44,10 +58,11 @@ and how the `process` method returns the number of chunks when called with a blo
  n = SmarterCSV.process(filename, options) do |chunk|
  # we're passing a block in, to process each resulting hash / row (block takes array of hashes)
  # when chunking is enabled, there are up to :chunk_size hashes in each chunk
- MyModel.collection.insert( chunk ) # insert up to 100 records at a time
+ MyModel.insert_all( chunk ) # insert up to 100 records at a time
  end
 
  => returns number of chunks we processed
  ```
 
-
+ ----------------
+ PREVIOUS: [The Basic API](./basic_api.md) | NEXT: [Configuration Options](./options.md)
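To make the chunking above concrete, a hedged sketch that sets an explicit `:chunk_size` and hands each chunk to a background job; `ImportWorker` is a hypothetical Sidekiq worker, not part of SmarterCSV:

```ruby
require 'smarter_csv'

options = { chunk_size: 100 } # up to 100 hashes per chunk

n = SmarterCSV.process('import.csv', options) do |chunk|
  # Hypothetical Sidekiq worker: each chunk becomes one small, retryable job.
  ImportWorker.perform_async(chunk)
end

puts "enqueued #{n} chunks"
```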
data/docs/data_transformations.md CHANGED
@@ -1,3 +1,18 @@
+
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [**Data Transformations**](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Data Transformations
 
  SmarterCSV automatically transforms the values in each column in order to normalize the data.
@@ -30,3 +45,6 @@ It can happen that after all transformations, a row of the CSV file would produc
  By default SmarterCSV uses `remove_empty_hashes: true` to remove these empty hashes from the result.
 
  This can be set to `false` to keep these empty hashes in the results.
+
+ -------------------
+ PREVIOUS: [Header Validations](./header_validations.md) | NEXT: [Value Converters](./value_converters.md)
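A small sketch of the option discussed above; with `remove_empty_hashes: false`, rows whose values are all empty after the transformations are kept in the result (file name illustrative):

```ruby
require 'smarter_csv'

# Default: rows that end up completely empty are dropped from the result.
compact = SmarterCSV.process('survey.csv')

# Keep the empty hashes, e.g. to preserve the original row positions:
with_blanks = SmarterCSV.process('survey.csv', remove_empty_hashes: false)

puts "#{with_blanks.size - compact.size} blank rows were preserved"
```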
data/docs/examples.md CHANGED
@@ -1,4 +1,18 @@
 
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Examples
 
  Here are some examples to demonstrate the versatility of SmarterCSV.
data/docs/header_transformations.md CHANGED
@@ -1,3 +1,18 @@
+
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [**Header Transformations**](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Header Transformations
 
  By default SmarterCSV assumes that a CSV file has headers, and it automatically normalizes the headers and transforms them into Ruby symbols. You can completely customize or override this (see below).
@@ -93,3 +108,6 @@ For CSV files with headers, you can either:
  * some CSV files use un-escaped quotation characters inside fields. This can cause the import to break. To get around this, use the `:force_simple_split => true` option in combination with `:strip_chars_from_headers => /[\-"]/`. This will also significantly speed up the import.
  If you were to force a different `:quote_char` instead (setting it to a non-used character), then the import would be up to 5-times slower than using `:force_simple_split`.
 
+ ---------------
+ PREVIOUS: [Row and Column Separators](./row_col_sep.md) | NEXT: [Header Validations](./header_validations.md)
+
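A hedged sketch of the work-around described above for files with un-escaped quote characters; the `:force_simple_split` and `:strip_chars_from_headers` options are taken from the text, and the file name is illustrative:

```ruby
require 'smarter_csv'

# Un-escaped quotes inside fields can break regular CSV parsing.
# Splitting only on the column separator and stripping stray quotes and
# dashes from the headers sidesteps the problem (and is faster).
options = {
  force_simple_split: true,
  strip_chars_from_headers: /[\-"]/
}
data = SmarterCSV.process('messy_export.csv', options)
```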
data/docs/header_validations.md CHANGED
@@ -1,3 +1,18 @@
+
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [**Header Validations**](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Header Validations
 
  When you are importing data, it can be important to verify that all required data is present, to ensure consistent quality.
@@ -16,3 +31,6 @@ If these keys are not present, `SmarterCSV::MissingKeys` will be raised to infor
 
  => this will raise SmarterCSV::MissingKeys if any row does not contain these three keys
  ```
+
+ ----------------
+ PREVIOUS: [Header Transformations](./header_transformations.md) | NEXT: [Data Transformations](./data_transformations.md)
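For the validation described above, a minimal sketch assuming the `:required_keys` option and the `SmarterCSV::MissingKeys` exception named in the text:

```ruby
require 'smarter_csv'

begin
  data = SmarterCSV.process('users.csv', required_keys: [:first_name, :last_name, :email])
rescue SmarterCSV::MissingKeys => e
  # Raised if any row is missing one of the three required keys
  warn "import aborted: #{e.message}"
end
```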
data/docs/options.md CHANGED
@@ -1,5 +1,19 @@
 
- # SmarterCSV Options
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [**Configuration Options**](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
+ # Configuration Options
 
  ## CSV Writing
 
@@ -80,3 +94,5 @@ There have been a lot of 1-offs and feature creep around these options, and goin
  | | | also accepts either {:except => [:key1,:key2]} or {:only => :key3} |
  ---------------------------------------------------------------------------------------------------------------------------------
 
+ -------------
+ PREVIOUS: [Batch Processing](./batch_processing.md) | NEXT: [Row and Column Separators](./row_col_sep.md)
data/docs/row_col_sep.md CHANGED
@@ -1,4 +1,18 @@
 
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [**Row and Column Separators**](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [Value Converters](./value_converters.md)
+
+ --------------
+
  # Row and Column Separators
 
  ## Automatic Detection
@@ -85,3 +99,6 @@ In this example, we use `comment_regexp` to filter out and ignore any lines star
  end
  => returns number of chunks
  ```
+
+ ----------------
+ PREVIOUS: [Configuration Options](./options.md) | NEXT: [Header Transformations](./header_transformations.md)
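To illustrate the automatic detection and the `comment_regexp` filtering referenced above, a hedged sketch; `:auto` for both separators and the `:comment_regexp` option are taken from this documentation page, and the file name is illustrative:

```ruby
require 'smarter_csv'

options = {
  row_sep: :auto,        # detect \n, \r\n, or \r
  col_sep: :auto,        # detect ',', ';', or "\t"
  comment_regexp: /\A#/  # ignore lines starting with '#'
}
data = SmarterCSV.process('export_from_unknown_tool.csv', options)
```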
data/docs/value_converters.md CHANGED
@@ -1,4 +1,18 @@
 
+ ### Contents
+
+ * [Introduction](./_introduction.md)
+ * [The Basic API](./basic_api.md)
+ * [Batch Processing](./batch_processing.md)
+ * [Configuration Options](./options.md)
+ * [Row and Column Separators](./row_col_sep.md)
+ * [Header Transformations](./header_transformations.md)
+ * [Header Validations](./header_validations.md)
+ * [Data Transformations](./data_transformations.md)
+ * [**Value Converters**](./value_converters.md)
+
+ --------------
+
  # Using Value Converters
 
  Value Converters allow you to do custom transformations of specific rows, to help you massage the data so it fits the expectations of your down-stream process, such as creating a DB record.
@@ -49,3 +63,6 @@ If you use `key_mappings` and `value_converters`, make sure that the value conve
  first_record[:price].class
  => Float
  ```
+
+ --------------------
+ PREVIOUS: [Data Transformations](./data_transformations.md) | UP: [README](../README.md)
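A hedged sketch of a value converter as discussed above: the converter only needs to respond to `convert`, and it is registered per key via the `:value_converters` option (the `MoneyConverter` class and the `:price` key are illustrative):

```ruby
require 'smarter_csv'

# Illustrative converter: turns "$1,234.56" into the Float 1234.56
class MoneyConverter
  def self.convert(value)
    value.to_s.gsub(/[$,]/, '').to_f
  end
end

data = SmarterCSV.process('products.csv', value_converters: { price: MoneyConverter })
data.first[:price].class
# => Float
```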
data/lib/smarter_csv/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module SmarterCSV
- VERSION = "1.12.0.pre1"
+ VERSION = "1.12.0"
  end
data/smarter_csv.gemspec CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
  spec.email = ["tilo.sloboda@gmail.com"]
 
  spec.summary = "Convenient CSV Reading and Writing"
- spec.description = "Ruby Gem for convenient reading and writing: importing of CSV Files as Array(s) of Hashes, with lots of features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
+ spec.description = "Ruby Gem for convenient reading and writing of CSV files. It has intelligent defaults and auto-discovery of column and row separators. It imports CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking-off batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly, writing CSV files takes Hashes or Arrays of Hashes to create a CSV file."
  spec.homepage = "https://github.com/tilo/smarter_csv"
  spec.license = 'MIT'
 
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: smarter_csv
  version: !ruby/object:Gem::Version
- version: 1.12.0.pre1
+ version: 1.12.0
  platform: ruby
  authors:
  - Tilo Sloboda
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2024-07-08 00:00:00.000000000 Z
+ date: 2024-07-10 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: awesome_print
@@ -94,10 +94,11 @@ dependencies:
  - - ">="
  - !ruby/object:Gem::Version
  version: '0'
- description: 'Ruby Gem for convenient reading and writing: importing of CSV Files
- as Array(s) of Hashes, with lots of features for processing large files in parallel,
- embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers
- to Hash-keys'
+ description: Ruby Gem for convenient reading and writing of CSV files. It has intelligent
+ defaults and auto-discovery of column and row separators. It imports CSV Files
+ as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking-off
+ batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly,
+ writing CSV files takes Hashes or Arrays of Hashes to create a CSV file.
  email:
  - tilo.sloboda@gmail.com
  executables: []
@@ -122,7 +123,6 @@ files:
  - docs/examples.md
  - docs/header_transformations.md
  - docs/header_validations.md
- - docs/notes.md
  - docs/options.md
  - docs/row_col_sep.md
  - docs/value_converters.md
@@ -161,9 +161,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
  version: 2.5.0
  required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
- - - ">"
+ - - ">="
  - !ruby/object:Gem::Version
- version: 1.3.1
+ version: '0'
  requirements: []
  rubygems_version: 3.2.3
  signing_key:
data/docs/notes.md DELETED
@@ -1,29 +0,0 @@
-
- # Notes
-
-
-
-
- ## NOTES on the use of Chunking and Blocks:
- * chunking can be VERY USEFUL if used in combination with passing a block to File.read_csv FOR LARGE FILES
- * if you pass a block to File.read_csv, that block will be executed and given an Array of Hashes as the parameter.
- * if the chunk_size is not set, then the array will only contain one Hash.
- * if the chunk_size is > 0 , then the array may contain up to chunk_size Hashes.
- * this can be very useful when passing chunked data to a post-processing step, e.g. through Sidekiq
-
- ## NOTES about File Encodings:
- * if you have a CSV file which contains unicode characters, you can process it as follows:
-
- ```ruby
- File.open(filename, "r:bom|utf-8") do |f|
- data = SmarterCSV.process(f);
- end
-
- * if the CSV file with unicode characters is in a remote location, similarly you need to give the encoding as an option to the `open` call:
- ```ruby
- require 'open-uri'
- file_location = 'http://your.remote.org/sample.csv'
- open(file_location, 'r:utf-8') do |f| # don't forget to specify the UTF-8 encoding!!
- data = SmarterCSV.process(f)
- end
- ```