smarter_csv 1.12.0.pre1 → 1.12.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -3
- data/CONTRIBUTORS.md +1 -0
- data/README.md +2 -2
- data/docs/_introduction.md +18 -2
- data/docs/basic_api.md +18 -1
- data/docs/batch_processing.md +17 -2
- data/docs/data_transformations.md +18 -0
- data/docs/examples.md +14 -0
- data/docs/header_transformations.md +18 -0
- data/docs/header_validations.md +18 -0
- data/docs/options.md +17 -1
- data/docs/row_col_sep.md +17 -0
- data/docs/value_converters.md +17 -0
- data/lib/smarter_csv/auto_detection.rb +6 -1
- data/lib/smarter_csv/version.rb +1 -1
- data/smarter_csv.gemspec +1 -1
- metadata +9 -9
- data/docs/notes.md +0 -29
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 05aa9e7d2d22ec6e1beb3790e2b727cd3e615cadcd537716f2dfbb190cc87a09
|
4
|
+
data.tar.gz: e37b072c7c81a3b6cdc6192ed2bfab046c924f3aa7a8a3e2a66f55fafa25b7ff
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 07c149aaa123ef75fb65fd596fbab64359e24cf2b8606fe406d714358a1c14696fa9ecb420e6dd0a95d40f6af6d41e4988b16df9eac4346d9e1295e3c32f22b1
|
7
|
+
data.tar.gz: 71341c1cf1092fabbfe9106ce533adb872e2bc1b0c30fbc032f3ceaea1832e2ddef5d4156f1465658a67dddaae508cd23b12cfe9fdf34edea3f1f3ede0385688
|
data/CHANGELOG.md
CHANGED
@@ -1,10 +1,13 @@
|
|
1
1
|
|
2
2
|
# SmarterCSV 1.x Change Log
|
3
3
|
|
4
|
-
## 1.12.
|
5
|
-
*
|
4
|
+
## 1.12.1 (2024-07-10)
|
5
|
+
* Improved column separator detection by ignoring quoted sections [#276](https://github.com/tilo/smarter_csv/pull/276) (thanks to Nicolas Castellanos)
|
6
|
+
|
7
|
+
## 1.12.0 (2024-07-09)
|
8
|
+
* Added Thread-Safety: added SmarterCSV::Reader to process CSV files in a thread-safe manner ([issue #277](https://github.com/tilo/smarter_csv/pull/277))
|
6
9
|
* SmarterCSV::Writer changed default row separator to the system's row separator (`\n` on Linux, `\r\n` on Windows)
|
7
|
-
* added a
|
10
|
+
* added a doc tree
|
8
11
|
|
9
12
|
* POTENTIAL ISSUE:
|
10
13
|
|
data/CONTRIBUTORS.md
CHANGED
@@ -53,3 +53,4 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
|
|
53
53
|
* [JP Camara](https://github.com/jpcamara)
|
54
54
|
* [Kenton Hirowatari](https://github.com/hirowatari)
|
55
55
|
* [Daniel Pepper](https://github.com/dpep)
|
56
|
+
* [Nicolas Castellanos](https://github.com/nicastelo)
|
data/README.md
CHANGED
@@ -36,6 +36,7 @@ Or install it yourself as:
|
|
36
36
|
|
37
37
|
* [Introduction](docs/_introduction.md)
|
38
38
|
* [The Basic API](docs/basic_api.md)
|
39
|
+
* [Batch Processing](./docs/batch_processing.md)
|
39
40
|
* [Configuration Options](docs/options.md)
|
40
41
|
* [Row and Column Separators](docs/row_col_sep.md)
|
41
42
|
* [Header Transformations](docs/header_transformations.md)
|
@@ -43,9 +44,8 @@ Or install it yourself as:
|
|
43
44
|
* [Data Transformations](docs/data_transformations.md)
|
44
45
|
* [Value Converters](docs/value_converters.md)
|
45
46
|
|
46
|
-
* [Notes](docs/notes.md) <--- this info needs to be moved to individual pages
|
47
|
-
|
48
47
|
# Articles
|
48
|
+
* [Parsing CSV Files in Ruby with SmarterCSV](https://tilo-sloboda.medium.com/parsing-csv-files-in-ruby-with-smartercsv-6ce66fb6cf38)
|
49
49
|
* [Processing 1.4 Million CSV Records in Ruby, fast ](https://lcx.wien/blog/processing-14-million-csv-records-in-ruby/)
|
50
50
|
* [Speeding up CSV parsing with parallel processing](http://xjlin0.github.io/tech/2015/05/25/faster-parsing-csv-with-parallel-processing)
|
51
51
|
* [The original post](http://www.unixgods.org/Ruby/process_csv_as_hashes.html) that started SmarterCSV
|
data/docs/_introduction.md
CHANGED
@@ -1,8 +1,21 @@
|
|
1
1
|
|
2
|
-
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [**Introduction**](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
3
15
|
|
4
|
-
|
16
|
+
# SmarterCSV Introduction
|
5
17
|
|
18
|
+
`smarter_csv` is a Ruby Gem for convenient reading and writing of CSV files. It has intelligent defaults and auto-discovery of column and row separators. It imports CSV files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking off batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly, writing CSV files takes Hashes, or Arrays of Hashes, to create a CSV file.
|
6
19
|
|
7
20
|
## Why another CSV library?
|
8
21
|
|
@@ -38,3 +51,6 @@ The CSV processing also needed to be robust against variations in the input data
|
|
38
51
|
|
39
52
|
* Data Validations
|
40
53
|
(planned feature)
|
54
|
+
|
55
|
+
---------------
|
56
|
+
PREVIOUS: [README](../README.md) | NEXT: [The Basic API](./basic_api.md)
|
data/docs/basic_api.md
CHANGED
@@ -1,5 +1,19 @@
|
|
1
1
|
|
2
|
-
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [**The Basic API**](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
16
|
+
# SmarterCSV Basic API
|
3
17
|
|
4
18
|
Let's explore the basic APIs for reading and writing CSV files. There is a simplified API (backwards compatible with previous SmarterCSV versions) and the full API, which allows you to access the internal state of the reader or writer instance after processing.
|
5
19
|
|
@@ -138,3 +152,6 @@ $ hexdump -C spec/fixtures/bom_test_feff.csv
|
|
138
152
|
data = SmarterCSV.process(f)
|
139
153
|
end
|
140
154
|
```
|
155
|
+
|
156
|
+
----------------
|
157
|
+
PREVIOUS: [Introduction](./_introduction.md) | NEXT: [Batch Processing](./batch_processing.md)
|
data/docs/batch_processing.md
CHANGED
@@ -1,4 +1,18 @@
|
|
1
1
|
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [**Batch Processing**](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
2
16
|
# Batch Processing
|
3
17
|
|
4
18
|
Processing CSV data in batches (chunks) allows you to parallelize the workload of importing data.
|
@@ -44,10 +58,11 @@ and how the `process` method returns the number of chunks when called with a blo
|
|
44
58
|
n = SmarterCSV.process(filename, options) do |chunk|
|
45
59
|
# we're passing a block in, to process each resulting hash / row (block takes array of hashes)
|
46
60
|
# when chunking is enabled, there are up to :chunk_size hashes in each chunk
|
47
|
-
MyModel.
|
61
|
+
MyModel.insert_all( chunk ) # insert up to 100 records at a time
|
48
62
|
end
|
49
63
|
|
50
64
|
=> returns number of chunks we processed
|
51
65
|
```
|
52
66
|
|
53
|
-
|
67
|
+
----------------
|
68
|
+
PREVIOUS: [The Basic API](./basic_api.md) | NEXT: [Configuration Options](./options.md)
|
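The chunked block form in the hunk above can be illustrated without the gem. The following is a minimal plain-Ruby sketch, not SmarterCSV's implementation: `each_chunk` and `FakeModel` are hypothetical stand-ins introduced only for this example, and the rows are assumed to be already parsed into hashes. It mimics yielding arrays of up to `chunk_size` hashes and returning the number of chunks.

```ruby
# Hypothetical stand-in for a model class; it only records what was inserted.
class FakeModel
  @rows = []

  class << self
    attr_reader :rows

    # Mimics a bulk insert by appending all hashes of one chunk at once.
    def insert_all(chunk)
      @rows.concat(chunk)
    end
  end
end

# Hypothetical helper: yields arrays of up to chunk_size hashes and returns
# the number of chunks, similar in spirit to passing :chunk_size and a block.
def each_chunk(rows, chunk_size)
  count = 0
  rows.each_slice(chunk_size) do |chunk|
    yield chunk
    count += 1
  end
  count
end

# 250 already-parsed rows, processed 100 at a time
rows = (1..250).map { |i| { id: i, name: "user#{i}" } }

chunk_count = each_chunk(rows, 100) { |chunk| FakeModel.insert_all(chunk) }

puts chunk_count          # 3 (chunks of 100, 100 and 50 rows)
puts FakeModel.rows.size  # 250
```

Because each chunk is an independent array of hashes, it could equally be handed to a background job instead of being inserted inline.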
data/docs/data_transformations.md
CHANGED
@@ -1,3 +1,18 @@
|
|
1
|
+
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [**Data Transformations**](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
1
16
|
# Data Transformations
|
2
17
|
|
3
18
|
SmarterCSV automatically transforms the values in each column in order to normalize the data.
|
@@ -30,3 +45,6 @@ It can happen that after all transformations, a row of the CSV file would produc
|
|
30
45
|
By default SmarterCSV uses `remove_empty_hashes: true` to remove these empty hashes from the result.
|
31
46
|
|
32
47
|
This can be set to `false` to keep these empty hashes in the results.
|
48
|
+
|
49
|
+
-------------------
|
50
|
+
PREVIOUS: [Header Validations](./header_validations.md) | NEXT: [Value Converters](./value_converters.md)
|
data/docs/examples.md
CHANGED
@@ -1,4 +1,18 @@
|
|
1
1
|
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
2
16
|
# Examples
|
3
17
|
|
4
18
|
Here are some examples to demonstrate the versatility of SmarterCSV.
|
data/docs/header_transformations.md
CHANGED
@@ -1,3 +1,18 @@
|
|
1
|
+
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [**Header Transformations**](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
1
16
|
# Header Transformations
|
2
17
|
|
3
18
|
By default SmarterCSV assumes that a CSV file has headers, and it automatically normalizes the headers and transforms them into Ruby symbols. You can completely customize or override this (see below).
|
@@ -93,3 +108,6 @@ For CSV files with headers, you can either:
|
|
93
108
|
* some CSV files use un-escaped quotation characters inside fields. This can cause the import to break. To get around this, use the `:force_simple_split => true` option in combination with `:strip_chars_from_headers => /[\-"]/` . This will also significantly speed up the import.
|
94
109
|
If you instead force a different `:quote_char` (setting it to an unused character), the import will be up to 5 times slower than with `:force_simple_split`.
|
95
110
|
|
111
|
+
---------------
|
112
|
+
PREVIOUS: [Row and Column Separators](./row_col_sep.md) | NEXT: [Header Validations](./header_validations.md)
|
113
|
+
|
data/docs/header_validations.md
CHANGED
@@ -1,3 +1,18 @@
|
|
1
|
+
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [**Header Validations**](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
1
16
|
# Header Validations
|
2
17
|
|
3
18
|
When you are importing data, it can be important to verify that all required data is present, to ensure consistent quality.
|
@@ -16,3 +31,6 @@ If these keys are not present, `SmarterCSV::MissingKeys` will be raised to infor
|
|
16
31
|
|
17
32
|
=> this will raise SmarterCSV::MissingKeys if any row does not contain these three keys
|
18
33
|
```
|
34
|
+
|
35
|
+
----------------
|
36
|
+
PREVIOUS: [Header Transformations](./header_transformations.md) | NEXT: [Data Transformations](./data_transformations.md)
|
data/docs/options.md
CHANGED
@@ -1,5 +1,19 @@
|
|
1
1
|
|
2
|
-
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [**Configuration Options**](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
16
|
+
# Configuration Options
|
3
17
|
|
4
18
|
## CSV Writing
|
5
19
|
|
@@ -80,3 +94,5 @@ There have been a lot of 1-offs and feature creep around these options, and goin
|
|
80
94
|
| | | also accepts either {:except => [:key1,:key2]} or {:only => :key3} |
|
81
95
|
---------------------------------------------------------------------------------------------------------------------------------
|
82
96
|
|
97
|
+
-------------
|
98
|
+
PREVIOUS: [Batch Processing](./batch_processing.md) | NEXT: [Row and Column Separators](./row_col_sep.md)
|
data/docs/row_col_sep.md
CHANGED
@@ -1,4 +1,18 @@
|
|
1
1
|
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [**Row and Column Separators**](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [Value Converters](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
2
16
|
# Row and Column Separators
|
3
17
|
|
4
18
|
## Automatic Detection
|
@@ -85,3 +99,6 @@ In this example, we use `comment_regexp` to filter out and ignore any lines star
|
|
85
99
|
end
|
86
100
|
=> returns number of chunks
|
87
101
|
```
|
102
|
+
|
103
|
+
----------------
|
104
|
+
PREVIOUS: [Configuration Options](./options.md) | NEXT: [Header Transformations](./header_transformations.md)
|
data/docs/value_converters.md
CHANGED
@@ -1,4 +1,18 @@
|
|
1
1
|
|
2
|
+
### Contents
|
3
|
+
|
4
|
+
* [Introduction](./_introduction.md)
|
5
|
+
* [The Basic API](./basic_api.md)
|
6
|
+
* [Batch Processing](././batch_processing.md)
|
7
|
+
* [Configuration Options](./options.md)
|
8
|
+
* [Row and Column Separators](./row_col_sep.md)
|
9
|
+
* [Header Transformations](./header_transformations.md)
|
10
|
+
* [Header Validations](./header_validations.md)
|
11
|
+
* [Data Transformations](./data_transformations.md)
|
12
|
+
* [**Value Converters**](./value_converters.md)
|
13
|
+
|
14
|
+
--------------
|
15
|
+
|
2
16
|
# Using Value Converters
|
3
17
|
|
4
18
|
Value Converters allow you to apply custom transformations to the values in specific columns, to help you massage the data so it fits the expectations of your downstream process, such as creating a DB record.
|
@@ -49,3 +63,6 @@ If you use `key_mappings` and `value_converters`, make sure that the value conve
|
|
49
63
|
first_record[:price].class
|
50
64
|
=> Float
|
51
65
|
```
|
66
|
+
|
67
|
+
--------------------
|
68
|
+
PREVIOUS: [Data Transformations](./data_transformations.md) | UP: [README](../README.md)
|
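As a rough illustration of the converter idea from the hunk above: a converter is assumed here to be any object that responds to `convert(value)`. `DollarConverter` and the sample row are made up for this sketch, and the converters are applied by hand to one already-parsed row so the example runs without the gem.

```ruby
# A converter is assumed to be any object that responds to `convert(value)`.
class DollarConverter
  # Strips the currency sign and thousands separators, then converts to Float.
  def self.convert(value)
    value.to_s.delete('$,').to_f
  end
end

# With the gem installed, the converter would be passed via the
# :value_converters option (assumed usage, not executed here):
#   SmarterCSV.process(filename, value_converters: { price: DollarConverter })

# Stand-alone illustration: apply the converters to one already-parsed row.
converters = { price: DollarConverter }
row = { item: 'widget', price: '$1,234.50' }

converted = row.map do |key, value|
  converter = converters[key]
  [key, converter ? converter.convert(value) : value]
end.to_h

puts converted[:price].class  # Float
puts converted[:price]        # 1234.5
```

Keys without a registered converter pass through unchanged, which keeps the converter table small.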
data/lib/smarter_csv/auto_detection.rb
CHANGED
@@ -19,7 +19,12 @@ module SmarterCSV
|
|
19
19
|
count.times do
|
20
20
|
line = readline_with_counts(filehandle, options)
|
21
21
|
delimiters.each do |d|
|
22
|
-
|
22
|
+
escaped_quote = Regexp.escape(options[:quote_char])
|
23
|
+
|
24
|
+
# Count only non-quoted occurrences of the delimiter
|
25
|
+
non_quoted_text = line.split(/#{escaped_quote}[^#{escaped_quote}]*#{escaped_quote}/).join
|
26
|
+
|
27
|
+
candidates[d] += non_quoted_text.scan(d).count
|
23
28
|
end
|
24
29
|
rescue EOFError # short files
|
25
30
|
break
|
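The effect of the change above can be tried in isolation. This is a stand-alone sketch that reuses the same split-and-join expression from the hunk; the helper name `count_unquoted` and the sample line are assumptions made for illustration and are not part of the gem.

```ruby
# Counts occurrences of a delimiter outside quoted sections of a line.
# `count_unquoted` is a hypothetical helper wrapping the expression in the hunk.
def count_unquoted(line, delimiter, quote_char = '"')
  escaped_quote = Regexp.escape(quote_char)
  # Splitting on quoted sections drops them; joining leaves only unquoted text.
  non_quoted_text = line.split(/#{escaped_quote}[^#{escaped_quote}]*#{escaped_quote}/).join
  non_quoted_text.scan(delimiter).count
end

# A semicolon-separated line whose quoted fields contain commas
line = %q(name;"Doe, John, Jr.";"a,b,c";42)

[',', ';'].each do |d|
  puts "#{d.inspect}: naive=#{line.count(d)} unquoted=#{count_unquoted(line, d)}"
end
# ",": naive=4 unquoted=0
# ";": naive=3 unquoted=3
```

A naive count would favor the comma here (4 versus 3), while counting only unquoted text favors the semicolon, which is the actual separator of this line.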
data/lib/smarter_csv/version.rb
CHANGED
data/smarter_csv.gemspec
CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
|
|
10
10
|
spec.email = ["tilo.sloboda@gmail.com"]
|
11
11
|
|
12
12
|
spec.summary = "Convenient CSV Reading and Writing"
|
13
|
-
spec.description = "Ruby Gem for convenient reading and writing
|
13
|
+
spec.description = "Ruby Gem for convenient reading and writing of CSV files. It has intelligent defaults, and auto-discovery of column and row separators. It imports CSV Files as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking-off batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly, writing CSV files takes Hashes, or Arrays of Hashes to create a CSV file."
|
14
14
|
spec.homepage = "https://github.com/tilo/smarter_csv"
|
15
15
|
spec.license = 'MIT'
|
16
16
|
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: smarter_csv
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.12.
|
4
|
+
version: 1.12.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Tilo Sloboda
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2024-07-
|
11
|
+
date: 2024-07-10 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: awesome_print
|
@@ -94,10 +94,11 @@ dependencies:
|
|
94
94
|
- - ">="
|
95
95
|
- !ruby/object:Gem::Version
|
96
96
|
version: '0'
|
97
|
-
description:
|
98
|
-
|
99
|
-
|
100
|
-
to
|
97
|
+
description: Ruby Gem for convenient reading and writing of CSV files. It has intelligent
|
98
|
+
defaults, and auto-discovery of column and row separators. It imports CSV Files
|
99
|
+
as Array(s) of Hashes, suitable for direct processing with ActiveRecord, kicking-off
|
100
|
+
batch jobs with Sidekiq, parallel processing, or uploading data to S3. Similarly,
|
101
|
+
writing CSV files takes Hashes, or Arrays of Hashes to create a CSV file.
|
101
102
|
email:
|
102
103
|
- tilo.sloboda@gmail.com
|
103
104
|
executables: []
|
@@ -122,7 +123,6 @@ files:
|
|
122
123
|
- docs/examples.md
|
123
124
|
- docs/header_transformations.md
|
124
125
|
- docs/header_validations.md
|
125
|
-
- docs/notes.md
|
126
126
|
- docs/options.md
|
127
127
|
- docs/row_col_sep.md
|
128
128
|
- docs/value_converters.md
|
@@ -161,9 +161,9 @@ required_ruby_version: !ruby/object:Gem::Requirement
|
|
161
161
|
version: 2.5.0
|
162
162
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
163
163
|
requirements:
|
164
|
-
- - "
|
164
|
+
- - ">="
|
165
165
|
- !ruby/object:Gem::Version
|
166
|
-
version:
|
166
|
+
version: '0'
|
167
167
|
requirements: []
|
168
168
|
rubygems_version: 3.2.3
|
169
169
|
signing_key:
|
data/docs/notes.md
DELETED
@@ -1,29 +0,0 @@
|
|
1
|
-
|
2
|
-
# Notes
|
3
|
-
|
4
|
-
|
5
|
-
|
6
|
-
|
7
|
-
## NOTES on the use of Chunking and Blocks:
|
8
|
-
* chunking can be VERY USEFUL if used in combination with passing a block to File.read_csv FOR LARGE FILES
|
9
|
-
* if you pass a block to File.read_csv, that block will be executed and given an Array of Hashes as the parameter.
|
10
|
-
* if the chunk_size is not set, then the array will only contain one Hash.
|
11
|
-
* if the chunk_size is > 0 , then the array may contain up to chunk_size Hashes.
|
12
|
-
* this can be very useful when passing chunked data to a post-processing step, e.g. through Sidekiq
|
13
|
-
|
14
|
-
## NOTES about File Encodings:
|
15
|
-
* if you have a CSV file which contains unicode characters, you can process it as follows:
|
16
|
-
|
17
|
-
```ruby
|
18
|
-
File.open(filename, "r:bom|utf-8") do |f|
|
19
|
-
data = SmarterCSV.process(f);
|
20
|
-
end
|
21
|
-
```
|
22
|
-
* if the CSV file with unicode characters is in a remote location, similarly you need to give the encoding as an option to the `open` call:
|
23
|
-
```ruby
|
24
|
-
require 'open-uri'
|
25
|
-
file_location = 'http://your.remote.org/sample.csv'
|
26
|
-
open(file_location, 'r:utf-8') do |f| # don't forget to specify the UTF-8 encoding!!
|
27
|
-
data = SmarterCSV.process(f)
|
28
|
-
end
|
29
|
-
```
|