csv_party 0.0.1.pre9 → 1.0.0.rc4
- checksums.yaml +4 -4
- data/LICENSE.md +21 -0
- data/README.md +218 -0
- data/ROADMAP.md +271 -0
- data/lib/csv_party.rb +45 -275
- data/lib/csv_party/configuration.rb +82 -0
- data/lib/csv_party/data_preparer.rb +45 -0
- data/lib/csv_party/dsl.rb +38 -0
- data/lib/csv_party/errors.rb +157 -0
- data/lib/csv_party/parsers.rb +71 -0
- data/lib/csv_party/row.rb +83 -0
- data/lib/csv_party/runner.rb +219 -0
- data/lib/csv_party/testing.rb +6 -0
- metadata +14 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 67d5895445a9fe397df95275260491ed6d9f6ce5
|
4
|
+
data.tar.gz: 542a1442466867afa33cf0ef883a32443777cfd5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 1035dac76f5ec71d97015a5d9c2874074c28ad4b0ff7994ab565735fd6d49e539e4292a42ee15f44d8d2de2fb356e81a06f61685019fa3ad250d12ab9c160ba8
|
7
|
+
data.tar.gz: 93c748026adc2fa907f3e08825c0fef8b41fed5b2945e3066aff70978a86ed4c9d47df3a98abc30e25b2fb44a597cddb91efbb519699d2a9e3b4c580b4ef1100
|
data/LICENSE.md
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
MIT License
|
2
|
+
|
3
|
+
Copyright (c) 2018 Richard A. Jones
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
13
|
+
copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
21
|
+
SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,218 @@
|
|
1
|
+
[![Gem Version](https://badge.fury.io/rb/csv_party.svg)](https://badge.fury.io/rb/csv_party)
|
2
|
+
[![Build Status](https://travis-ci.org/toasterlovin/csv_party.svg?branch=master)](https://travis-ci.org/toasterlovin/csv_party)
|
3
|
+
[![Code Climate Maintainability](https://api.codeclimate.com/v1/badges/946d0dec172fda05d631/maintainability)](https://codeclimate.com/github/toasterlovin/csv_party/maintainability)
|
4
|
+
[![Code Climate Test Coverage](https://api.codeclimate.com/v1/badges/946d0dec172fda05d631/test_coverage)](https://codeclimate.com/github/toasterlovin/csv_party/test_coverage)
|
5
|
+
|
6
|
+
# Make importing CSV files a party
|
7
|
+
|
8
|
+
The point of this gem is to make it easier to focus on the business
|
9
|
+
logic of your CSV imports. You start by defining which columns you
|
10
|
+
will be importing, as well as how they will be parsed. Then, you
|
11
|
+
specify what you want to do with each row after it has been parsed.
|
12
|
+
That's it; CSVParty takes care of all the tedious stuff for you.
|
13
|
+
|
14
|
+
## Defining Columns
|
15
|
+
|
16
|
+
This is what defining your import columns looks like:
|
17
|
+
|
18
|
+
class MyImporter < CSVParty
|
19
|
+
column :price, header: 'Nonsensical Column Name', as: :decimal
|
20
|
+
end
|
21
|
+
|
22
|
+
This will take the value in the 'Nonsensical Column Name' column,
|
23
|
+
parse it as a decimal, then make it available to your import logic
|
24
|
+
as a nice, sane variable named `price`.
|
25
|
+
|
26
|
+
The available built-in parsers are:
|
27
|
+
|
28
|
+
- `:raw` returns the value from the CSV file, unchanged
|
29
|
+
- `:string` strips whitespace and returns the resulting string
|
30
|
+
- `:integer` strips whitespace, then calls `to_i` on the resulting string
|
31
|
+
- `:decimal` strips all characters except `0-9` and `.`, then passes the
|
32
|
+
resulting string to `BigDecimal.new`
|
33
|
+
- `:boolean` strips whitespace, downcases, then returns `true` if the
|
34
|
+
resulting string is `'1'`, `'t'`, or `'true'`, otherwise it returns `false`
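As a rough illustration of the behaviour described in the list above, the built-in parsers could be approximated with plain lambdas. This is a sketch based only on those descriptions, not the gem's actual implementation; the `BUILT_IN_PARSERS` constant is hypothetical, and `BigDecimal(...)` is used in place of `BigDecimal.new` so the snippet also runs on newer Rubies.

```ruby
require 'bigdecimal'

# Hypothetical stand-ins for the built-in parsers, following the list above.
BUILT_IN_PARSERS = {
  raw:     ->(value) { value },
  string:  ->(value) { value.strip },
  integer: ->(value) { value.strip.to_i },
  # Keeps only digits and '.', so currency symbols, commas, and a leading '-' are dropped.
  decimal: ->(value) { BigDecimal(value.gsub(/[^0-9.]/, '')) },
  boolean: ->(value) { %w[1 t true].include?(value.strip.downcase) }
}.freeze

price = BUILT_IN_PARSERS[:decimal].call('$1,299.50')
in_stock = BUILT_IN_PARSERS[:boolean].call(' TRUE ')
```

Note that, read literally, the `:decimal` description would also strip a minus sign, so negative amounts may need a custom parser.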
|
35
|
+
|
36
|
+
When defining a column, you can also pass a block if you need custom
|
37
|
+
parsing logic:
|
38
|
+
|
39
|
+
class MyImporter < CSVParty
|
40
|
+
column :product, header: 'Product' do |value|
|
41
|
+
Product.find_by(name: value)
|
42
|
+
end
|
43
|
+
end
|
44
|
+
|
45
|
+
Or, if you want to re-use a custom parser for multiple columns, just
|
46
|
+
define a method on your class with a name that ends in `_parser` and
|
47
|
+
you can use it the same way you use the built-in parsers:
|
48
|
+
|
49
|
+
class MyImporter < CSVParty
|
50
|
+
def dollars_to_cents_parser(value)
|
51
|
+
(BigDecimal.new(value) * 100).to_i
|
52
|
+
end
|
53
|
+
|
54
|
+
column :price_in_cents, header: 'Price in $', as: :dollars_to_cents
|
55
|
+
column :cost_in_cents, header: 'Cost in $', as: :dollars_to_cents
|
56
|
+
end
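One way the `_parser` naming convention could be resolved is by building the method name from the `as:` option. The following is a minimal sketch under that assumption; `ParserLookupSketch` and its `parse` method are hypothetical and are not part of CSVParty.

```ruby
require 'bigdecimal'

# Hypothetical class showing how `as: :dollars_to_cents` could map to a method.
class ParserLookupSketch
  def dollars_to_cents_parser(value)
    (BigDecimal(value) * 100).to_i
  end

  # Appends "_parser" to the parser name and calls the matching method.
  def parse(value, as:)
    public_send("#{as}_parser", value)
  end
end

cents = ParserLookupSketch.new.parse('9.99', as: :dollars_to_cents)
```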
|
57
|
+
|
58
|
+
#### NOTE: Parsing nil and blank values
|
59
|
+
|
60
|
+
By default, CSVParty will intercept any values that are `nil` or which contain
|
61
|
+
only whitespace and coerce them to `nil` _without invoking the parser for that
|
62
|
+
column_. This applies to all parsers, including custom parsers which you
|
63
|
+
define, with one exception: the :raw parser. This is done as a convenience to
|
64
|
+
avoid pesky `NoMethodError`s that arise when a parser tries to do its thing
|
65
|
+
to a `nil` value that it wasn't expecting. You can turn this behavior off on a
|
66
|
+
given column by setting `intercept_blanks` to `false` in the options hash:
|
67
|
+
|
68
|
+
class MyImporter < CSVParty
|
69
|
+
column :price, header: 'Price', intercept_blanks: false do |value|
|
70
|
+
if value.nil?
|
71
|
+
'n/a'
|
72
|
+
else
|
73
|
+
BigDecimal.new(value)
|
74
|
+
end
|
75
|
+
end
|
76
|
+
end
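The interception rule can be pictured as a small guard that runs before the parser. This sketch assumes a hypothetical `parse_value` helper; only the `intercept_blanks` option name comes from the text above.

```ruby
# Hypothetical helper: returns nil for nil or whitespace-only values without
# calling the parser, unless interception is switched off.
def parse_value(raw, intercept_blanks: true)
  return nil if intercept_blanks && (raw.nil? || raw.strip.empty?)

  yield raw
end

intercepted = parse_value('   ') { |value| value.to_i }
passed_through = parse_value(nil, intercept_blanks: false) do |value|
  value.nil? ? 'n/a' : value
end
```

With interception on, the block never sees the blank value; with it off, the block is responsible for handling `nil` itself.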
|
77
|
+
|
78
|
+
#### NOTE: Parsers cannot reference each other
|
79
|
+
|
80
|
+
When using a custom parser to parse a column, the block or method that you
|
81
|
+
define has no way to reference the values from any other columns. So, this won't
|
82
|
+
work:
|
83
|
+
|
84
|
+
class MyImporter < CSVParty
|
85
|
+
column :product, header: 'Product' do |value|
|
86
|
+
Product.find_by(name: value)
|
87
|
+
end
|
88
|
+
|
89
|
+
column :price, header: 'Price' do |value|
|
90
|
+
# product is not defined...
|
91
|
+
product.price = BigDecimal.new(value)
|
92
|
+
end
|
93
|
+
end
|
94
|
+
|
95
|
+
Instead, you would do this in your row import logic. Which brings us to:
|
96
|
+
|
97
|
+
## Importing Rows
|
98
|
+
|
99
|
+
Once you've defined all of your columns, you specify your logic for importing
|
100
|
+
rows by passing a block to the `rows` DSL method. That block will have access
|
101
|
+
to a `row` variable which contains all of the parsed values for your columns.
|
102
|
+
Here's what that looks like:
|
103
|
+
|
104
|
+
class MyImporter < CSVParty
|
105
|
+
rows do |row|
|
106
|
+
product = row.product
|
107
|
+
product.price = row.price
|
108
|
+
product.save
|
109
|
+
end
|
110
|
+
end
|
111
|
+
|
112
|
+
The `row` variable also provides access to two other things:
|
113
|
+
|
114
|
+
- The unparsed values for your columns
|
115
|
+
- The raw CSV string for that row
|
116
|
+
|
117
|
+
Here's how you access those:
|
118
|
+
|
119
|
+
class MyImporter < CSVParty
|
120
|
+
rows do |row|
|
121
|
+
row.price # parsed value: #<BigDecimal:7f88d92cb820,'0.9E1',9(18)>
|
122
|
+
row.unparsed.price # unparsed value: '$9.00'
|
123
|
+
row.string # raw CSV string: 'USB Cable,$9.00,Box,Blue'
|
124
|
+
end
|
125
|
+
end
|
126
|
+
|
127
|
+
## Importing
|
128
|
+
|
129
|
+
Once your importer class is defined, you use it like this:
|
130
|
+
|
131
|
+
importer = MyImporter.new('path/to/file.csv')
|
132
|
+
importer.import!
|
133
|
+
|
134
|
+
You can also specify what should happen before and after your import by passing
|
135
|
+
a block to `import`, like so:
|
136
|
+
|
137
|
+
class MyImporter < CSVParty
|
138
|
+
# column definitions
|
139
|
+
# row import logic
|
140
|
+
|
141
|
+
import do
|
142
|
+
puts 'Starting import'
|
143
|
+
import_rows!
|
144
|
+
puts 'Import finished!'
|
145
|
+
end
|
146
|
+
end
|
147
|
+
|
148
|
+
You can do whatever you want inside of the `import` block, just make sure to
|
149
|
+
call `import_rows!` somewhere in there.
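To see why `import_rows!` has to be called from inside the block, here is a minimal sketch of how such a hook could be wired up. It assumes the block is stored on the class and evaluated on the instance; `ImportHookSketch` and `LoggingImporter` are hypothetical and do not reflect CSVParty's internals.

```ruby
# Hypothetical base class: stores the import block and runs it on the instance.
class ImportHookSketch
  attr_reader :log

  class << self
    attr_reader :import_block

    def import(&block)
      @import_block = block
    end
  end

  def initialize
    @log = []
  end

  def import!
    block = self.class.import_block
    # Without a custom block, rows are imported directly.
    block ? instance_exec(&block) : import_rows!
  end

  def import_rows!
    @log << 'rows imported'
  end
end

class LoggingImporter < ImportHookSketch
  import do
    log << 'before'
    import_rows!
    log << 'after'
  end
end

importer = LoggingImporter.new
importer.import!
```

In this sketch, a block that never calls `import_rows!` would run its own code and import nothing.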
|
150
|
+
|
151
|
+
## Handling Errors
|
152
|
+
|
153
|
+
One of the hallmarks of importing data from CSV files is that there are
|
154
|
+
inevitably rows with errors of some kind. You can handle error rows by
|
155
|
+
specifying an `errors` block:
|
156
|
+
|
157
|
+
class MyImporter < CSVParty
|
158
|
+
# column definitions
|
159
|
+
# row import logic
|
160
|
+
|
161
|
+
errors do |error, line_number|
|
162
|
+
# log error
|
163
|
+
end
|
164
|
+
end
|
165
|
+
|
166
|
+
Any row in your CSV file which results in an exception will be passed to this
|
167
|
+
block. Which means you can specify that there is an error with a given row by
|
168
|
+
raising an exception:
|
169
|
+
|
170
|
+
rows do |row|
|
171
|
+
# rows with price less than 0 will be treated as errors
|
172
|
+
raise if row.price < 0
|
173
|
+
end
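The same idea can be sketched with Ruby's standard `CSV` library: each row is processed inside a `begin`/`rescue`, and any exception is recorded together with its line number. The variable names and sample data below are illustrative assumptions, not CSVParty code.

```ruby
require 'csv'
require 'bigdecimal'

csv_text = "Product,Price\nUSB Cable,9.00\nBroken Widget,-1.00\n"
imported = []
error_rows = []

CSV.parse(csv_text, headers: true).each_with_index do |row, index|
  line_number = index + 2 # the header occupies line 1
  begin
    price = BigDecimal(row['Price'])
    raise ArgumentError, 'price must not be negative' if price < 0

    imported << row['Product']
  rescue StandardError => error
    # Plays the role of the `errors` block: receives the error and line number.
    error_rows << [error, line_number]
  end
end
```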
|
174
|
+
|
175
|
+
## External Dependencies
|
176
|
+
|
177
|
+
Sometimes you need access to external objects in your importer's logic. You can specify
|
178
|
+
what external objects your importer depends on with `depends_on`. Dependencies declared
|
179
|
+
this way will then be available in your parsers and your `rows`, `import`, and `errors`
|
180
|
+
blocks:
|
181
|
+
|
182
|
+
class MyImporter < CSVParty
|
183
|
+
# column definitions...
|
184
|
+
|
185
|
+
depends_on :product_import
|
186
|
+
|
187
|
+
rows do |row|
|
188
|
+
# do some stuff
|
189
|
+
|
190
|
+
# product_import is not provided by the class,
|
191
|
+
# but is passed in at runtime instead!
|
192
|
+
product_import.log_success(product)
|
193
|
+
end
|
194
|
+
end
|
195
|
+
|
196
|
+
Then, to pass the dependency in at runtime, you just add an option to `.new` with
|
197
|
+
the name and value of the dependency:
|
198
|
+
|
199
|
+
MyImporter.new(
|
200
|
+
'path/to/csv',
|
201
|
+
product_import: @product_import
|
202
|
+
)
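A dependency mechanism of this kind could be built from a class-level macro plus a check in the constructor. The sketch below is one possible approach under that assumption; `DependencySketch`, `ProductImporter`, and the error message are hypothetical and are not taken from the gem.

```ruby
# Hypothetical base class: `depends_on` records names and defines readers,
# and the constructor requires a value for every declared dependency.
class DependencySketch
  def self.dependencies
    @dependencies ||= []
  end

  def self.depends_on(*names)
    dependencies.concat(names)
    attr_reader(*names)
  end

  def initialize(path, options = {})
    @path = path
    self.class.dependencies.each do |name|
      raise ArgumentError, "missing dependency: #{name}" unless options.key?(name)

      instance_variable_set("@#{name}", options[name])
    end
  end
end

class ProductImporter < DependencySketch
  depends_on :product_import
end

importer = ProductImporter.new('path/to/csv', product_import: :import_record)
```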
|
203
|
+
|
204
|
+
# Tested Rubies
|
205
|
+
|
206
|
+
CSVParty has been tested against the following Rubies:
|
207
|
+
|
208
|
+
MRI
|
209
|
+
- 2.5
|
210
|
+
- 2.4
|
211
|
+
- 2.3
|
212
|
+
- 2.2
|
213
|
+
- 2.1
|
214
|
+
- 2.0
|
215
|
+
|
216
|
+
# License
|
217
|
+
|
218
|
+
This project uses the MIT License. See LICENSE.md for details.
|
data/ROADMAP.md
ADDED
@@ -0,0 +1,271 @@
|
|
1
|
+
Roadmap
|
2
|
+
-
|
3
|
+
|
4
|
+
- [1.1 Early Return While Parsing](#11-early-return-while-parsing)
|
5
|
+
- [1.2 Rows to Hash](#12-rows-to-hash)
|
6
|
+
- [1.3 Generate Unimported Rows CSV](#13-generate-unimported-rows-csv)
|
7
|
+
- [1.4 Batch API](#14-batch-api)
|
8
|
+
- [1.5 Runtime Configuration](#15-runtime-configuration)
|
9
|
+
- [1.6 CSV Parse Error Handling](#16-csv-parse-error-handling)
|
10
|
+
- [Someday Features](#someday-features)
|
11
|
+
- [Column Numbers](#column-numbers)
|
12
|
+
- [Multi-column Parsing](#multi-column-parsing)
|
13
|
+
- [Parse Dependencies](#parse-dependencies)
|
14
|
+
|
15
|
+
#### 1.1 Early Return While Parsing
|
16
|
+
|
17
|
+
Currently, CSVParty is pretty well thought out about what should happen when
|
18
|
+
either 1) one of the built in flow control methods (`next_row`, `skip_row`,
|
19
|
+
`abort_row`, and `abort_import`) is used, or 2) an error is raised while
|
20
|
+
the row importer block is being executed. However, all of these things can also
|
21
|
+
happen when the columns for a row are being parsed. When/if it does, most of the
|
22
|
+
flow control and error handling kind of assumes that the row has been fully
|
23
|
+
parsed. So some design work should go into deciding what should happen in these
|
24
|
+
cases. And then tests should be written for all of the various scenarios.
|
25
|
+
|
26
|
+
#### 1.2 Rows to Hash
|
27
|
+
|
28
|
+
One of the primary use cases for importing CSV files is to insert their contents
|
29
|
+
into a database. Apparently this is common enough that the
|
30
|
+
[csv-importer](https://github.com/pcreux/csv-importer) gem, which almost
|
31
|
+
completely automates this process without much room for customization, is very
|
32
|
+
popular. So, in the case where there is a pretty simple correspondence between
|
33
|
+
the contents of a CSV file and ActiveRecord models, it should be dead simple to
|
34
|
+
get the job done.
|
35
|
+
|
36
|
+
What I have in mind is something like:
|
37
|
+
|
38
|
+
class MyImporter < CSVParty::Importer
|
39
|
+
column :product_id
|
40
|
+
column :quantity
|
41
|
+
column :price
|
42
|
+
|
43
|
+
rows do |row|
|
44
|
+
LineItem.create(row.attributes)
|
45
|
+
end
|
46
|
+
end
|
47
|
+
|
48
|
+
Where `row.attributes` returns a hash with all of the column names as keys and
|
49
|
+
all of the parsed values as values. So, with an importer like the one above,
|
50
|
+
`row.attributes` would return a hash like so:
|
51
|
+
|
52
|
+
{ product_id: 42, quantity: 3, price: 9.99 }
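For a sense of how small this feature could be, a row object backed by a `Struct` already offers the needed conversion through `to_h`. This is only a sketch; the `SketchRow` struct is hypothetical and the real row class may be structured differently.

```ruby
# Hypothetical row object: one member per column, `attributes` returns a hash.
SketchRow = Struct.new(:product_id, :quantity, :price) do
  def attributes
    to_h
  end
end

attributes = SketchRow.new(42, 3, 9.99).attributes
```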
|
53
|
+
|
54
|
+
#### 1.3 Generate Unimported Rows CSV
|
55
|
+
|
56
|
+
Most user inputs to an application are relatively constrained. CSV files, on the
|
57
|
+
other hand, are not. Users can, and will, put all kinds of erroneous data into
|
58
|
+
their CSV files. So, it is useful to be able to provide a user with a list of
|
59
|
+
the rows in their file that could not be imported, so that they can re-import
|
60
|
+
these rows after they have resolved whatever issues existed. And CSV is a
|
61
|
+
natural format for this, since the user can open the file in Excel and make
|
62
|
+
edits.
|
63
|
+
|
64
|
+
A motivated user of CSVParty can already achieve this by accessing the
|
65
|
+
`skipped_rows`, `aborted_rows`, and `error_rows` arrays and constructing one or
|
66
|
+
more CSV files from these, but it would be nice to provide a default
|
67
|
+
implementation that is only a method call away. What I have in mind is for the
|
68
|
+
CSV file that is created to have the exact same column structure as the original
|
69
|
+
file, but with three additional columns:
|
70
|
+
|
71
|
+
- The original row number
|
72
|
+
- The status (skipped, aborted, errored)
|
73
|
+
- A message explaining the reason for the status
|
74
|
+
|
75
|
+
Conveniently, all of these pieces of data are available for skipped, aborted,
|
76
|
+
and errored rows. Then, the file would be generated with a method, like so:
|
77
|
+
|
78
|
+
# all three combined
|
79
|
+
importer.unimported_rows_as_csv
|
80
|
+
# or separate
|
81
|
+
importer.skipped_rows_as_csv
|
82
|
+
importer.aborted_rows_as_csv
|
83
|
+
importer.error_rows_as_csv
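A possible shape for the generator, using the standard `CSV` library, is sketched below. The `UnimportedRow` struct, the extra header names, and the free-standing `unimported_rows_as_csv` function are assumptions made for illustration.

```ruby
require 'csv'

# Hypothetical record of a row that was not imported.
UnimportedRow = Struct.new(:line_number, :status, :message, :fields)

# Emits the original columns plus row number, status, and message.
def unimported_rows_as_csv(headers, rows)
  CSV.generate do |csv|
    csv << headers + ['Row Number', 'Status', 'Message']
    rows.sort_by(&:line_number).each do |row|
      csv << row.fields + [row.line_number, row.status, row.message]
    end
  end
end

rows = [
  UnimportedRow.new(7, 'skipped', 'duplicate product', ['HDMI Cable', '$12.00']),
  UnimportedRow.new(3, 'errored', 'price is negative', ['USB Cable', '$-1.00'])
]
report = unimported_rows_as_csv(%w[Product Price], rows)
```

Sorting by the original row number keeps the report in the same order as the uploaded file, which should make it easier for a user to cross-reference.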
|
84
|
+
|
85
|
+
#### 1.4 Batch API
|
86
|
+
|
87
|
+
It can be way more performant to batch imports so that expensive operations,
|
88
|
+
like persisting data, are only done every so often. This would add an API to
|
89
|
+
accumulate data, execute some logic every X number of rows, reset the
|
90
|
+
accumulators, then repeat. Here's a rough sketch of what that API might look
|
91
|
+
like:
|
92
|
+
|
93
|
+
rows do |row|
|
94
|
+
customers[row.customer_id] = { name: row.customer_name, phone: row.phone }
|
95
|
+
orders[row.order_id] = { customer_id: row.customer_id, invoice_number: row.invoice_number }
|
96
|
+
end
|
97
|
+
|
98
|
+
batch 50, customers: {}, orders: {} do
|
99
|
+
# insert customers into database
|
100
|
+
# insert orders into database
|
101
|
+
end
|
102
|
+
|
103
|
+
The first argument is how often the batch logic should be executed. In this
|
104
|
+
case, every 50 rows. Then there is a hash of accumulators, where the keys are
|
105
|
+
the names of the accumulators and the values are the initial values. Declaring
|
106
|
+
the accumulators accomplishes two things:
|
107
|
+
|
108
|
+
1. It provides accessor methods so that the accumulators can be accessed from
|
109
|
+
within the row import block.
|
110
|
+
2. It automatically resets the accumulators to their initial values each time
|
111
|
+
the batch block is executed.
|
112
|
+
|
113
|
+
So, it is essentially functionally identical to doing the following:
|
114
|
+
|
115
|
+
class MyImporter < CSVParty::Importer
|
116
|
+
attr_accessor :customers, :orders
|
117
|
+
|
118
|
+
def customers
|
119
|
+
@customers ||= {}
|
120
|
+
end
|
121
|
+
|
122
|
+
def orders
|
123
|
+
@orders ||= {}
|
124
|
+
end
|
125
|
+
|
126
|
+
rows do |row|
|
127
|
+
# add customer to customers accumulator
|
128
|
+
# add order to orders accumulator
|
129
|
+
end
|
130
|
+
|
131
|
+
batch 50 do
|
132
|
+
# insert customers into database
|
133
|
+
# insert orders into database
|
134
|
+
self.customers = {}
|
135
|
+
self.orders = {}
|
136
|
+
end
|
137
|
+
end
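Outside of any DSL, the accumulate, flush, and reset cycle can be sketched in a few lines of plain Ruby. The `each_batch` helper is hypothetical, and flushing a final partial batch is an assumption that the text above does not settle.

```ruby
# Hypothetical helper: yields the accumulator every `batch_size` rows, then resets it.
def each_batch(rows, batch_size)
  customers = {}
  rows.each_with_index do |row, index|
    customers[row[:customer_id]] = { name: row[:customer_name] }
    next unless ((index + 1) % batch_size).zero?

    yield customers
    customers = {}
  end
  # Assumption: rows left over after the last full batch are flushed too.
  yield customers unless customers.empty?
end

rows = (1..5).map { |id| { customer_id: id, customer_name: "Customer #{id}" } }
batch_sizes = []
each_batch(rows, 2) { |customers| batch_sizes << customers.size }
```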
|
138
|
+
|
139
|
+
_Note:_ The following is a rough sketch of an API that would handle a use case
|
140
|
+
that has come up. However, some research should be done first to figure out if
|
141
|
+
the use case it addresses is common.
|
142
|
+
|
143
|
+
One use case that has been mentioned is when rows are grouped by their
|
144
|
+
relationship to a parent record and those rows need to be acted on as a group.
|
145
|
+
So, imagine a CSV file like so:
|
146
|
+
|
147
|
+
Customer,Address,Product,Quantity,Price
|
148
|
+
Joe Smith,123 Main St.,Birkenstocks,1,74.99
|
149
|
+
Joe Smith,123 Main St.,Air Jordans,1,129.99
|
150
|
+
Joe Smith,123 Main St.,Tevas,3,59.99
|
151
|
+
Jane Doe,713 Broadway,Converse All-Star,1,39.99
|
152
|
+
Jane Doe,713 Broadway,Toms,1,59.99
|
153
|
+
|
154
|
+
It might be useful to be able to specify the batch interval in terms of one of
|
155
|
+
the columns in the CSV file, rather than as a number of rows. So, you would be
|
156
|
+
able to do:
|
157
|
+
|
158
|
+
class MyImporter < CSVParty::Importer
|
159
|
+
column :customer
|
160
|
+
column :address
|
161
|
+
column :product
|
162
|
+
column :quantity, as: :integer
|
163
|
+
column :price, as: :decimal
|
164
|
+
|
165
|
+
rows do |row|
|
166
|
+
line_items << { product: row.product, quantity: row.quantity, price: row.price }
|
167
|
+
end
|
168
|
+
|
169
|
+
batch :customer, line_items: [] do |current_row|
|
170
|
+
Customer.create(name: current_row.customer, address: current_row.address)
|
171
|
+
line_items.each do |li|
|
172
|
+
LineItem.create(li)
|
173
|
+
end
|
174
|
+
end
|
175
|
+
end
|
176
|
+
|
177
|
+
In this case, the batch logic gets executed every time there is a change in the
|
178
|
+
`:customer` column from one row to the next, rather than every X number of rows.
|
179
|
+
The accumulator works the same way: accessors are made available for adding
|
180
|
+
records to the accumulator and then the accumulator is automatically reset to
|
181
|
+
its initial value each time the batch logic is executed.
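Grouping consecutive rows by a column value is something Ruby's `Enumerable#slice_when` already does, so a sketch of the idea needs only the standard library. The variable names are illustrative, and the CSV data is the sample from above.

```ruby
require 'csv'

csv_text = <<~DATA
  Customer,Address,Product,Quantity,Price
  Joe Smith,123 Main St.,Birkenstocks,1,74.99
  Joe Smith,123 Main St.,Air Jordans,1,129.99
  Joe Smith,123 Main St.,Tevas,3,59.99
  Jane Doe,713 Broadway,Converse All-Star,1,39.99
  Jane Doe,713 Broadway,Toms,1,59.99
DATA

rows = CSV.parse(csv_text, headers: true).map(&:to_h)

# Start a new group whenever the Customer value changes between adjacent rows.
groups = rows.slice_when do |previous, current|
  previous['Customer'] != current['Customer']
end.to_a

summary = groups.map { |group| [group.first['Customer'], group.length] }
```

Because only adjacent rows are compared, this approach assumes the file is already sorted or grouped by customer.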
|
182
|
+
|
183
|
+
#### 1.5 Runtime Configuration
|
184
|
+
|
185
|
+
Sometimes it is useful to be able to configure an importer at runtime, rather than
|
186
|
+
at code writing time. An obvious example of when this would be useful is in the
|
187
|
+
case of user defined column header names. So, imagine a UI in which the user
|
188
|
+
uploads their CSV file, then specifies which column is, for example, the product
|
189
|
+
column, which is the quantity column, and which is the price column. In a case
|
190
|
+
like this, there is no way to specify the column definitions ahead of time; we
|
191
|
+
have to wait for the header names from the user.
|
192
|
+
|
193
|
+
Here is a sketch of what the API for runtime configuration would look like:
|
194
|
+
|
195
|
+
class MyImporter < CSVParty::Importer
|
196
|
+
rows do |row|
|
197
|
+
# persist data
|
198
|
+
end
|
199
|
+
end
|
200
|
+
|
201
|
+
# then:
|
202
|
+
|
203
|
+
my_importer = MyImporter.new
|
204
|
+
my_importer.configure do
|
205
|
+
column :product, header: user_product_header
|
206
|
+
column :quantity, header: user_quantity_header, as: :integer
|
207
|
+
column :price, header: user_price_header, as: :decimal
|
208
|
+
end
|
209
|
+
|
210
|
+
An open question is whether all DSL methods should be configurable at runtime.
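The underlying need, mapping user-chosen headers onto fixed column names, can be illustrated with the standard `CSV` library alone. In this sketch the `user_headers` hash stands in for whatever the UI collects; it and the sample data are assumptions.

```ruby
require 'csv'

csv_text = "Item,Qty,Unit Price\nUSB Cable,2,9.00\n"

# Assumption: these header names are supplied by the user at runtime.
user_headers = { product: 'Item', quantity: 'Qty', price: 'Unit Price' }

rows = CSV.parse(csv_text, headers: true).map do |csv_row|
  user_headers.each_with_object({}) do |(name, header), parsed|
    parsed[name] = csv_row[header]
  end
end
```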
|
211
|
+
|
212
|
+
#### 1.6 CSV Parse Error Handling
|
213
|
+
|
214
|
+
Sometimes it is useful to be able to completely ignore parsing and encoding
|
215
|
+
errors raised by the `CSV` class. To be clear, doing so is dangerous, since the
|
216
|
+
parsing logic in the `CSV` class is not designed to continue operating after it
|
217
|
+
encounters an error and raises. But sometimes you don't want to let a single
|
218
|
+
improperly encoded character prevent you from importing an entire CSV file. So,
|
219
|
+
this feature would be an optional way to either ignore those errors or respond
|
220
|
+
to them, and then continue importing. The API would probably be similar to the
|
221
|
+
error handling API for non-parse errors. So:
|
222
|
+
|
223
|
+
parse_errors :ignore # silently continue importing the next row
|
224
|
+
|
225
|
+
parse_errors do |line_number|
|
226
|
+
# handle parse error
|
227
|
+
end
|
228
|
+
|
229
|
+
my_import.parse_error_rows # returns array of parse error rows
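One way to keep going after a malformed row is to parse the file one line at a time and rescue `CSV::MalformedCSVError` per line. The sketch below takes that approach; it assumes no field spans multiple lines, and the variable names are illustrative.

```ruby
require 'csv'

lines = ['USB Cable,9.00', '"Unclosed,9.00', 'HDMI Cable,12.00']
parsed_rows = []
parse_error_rows = []

lines.each_with_index do |line, index|
  begin
    parsed_rows << CSV.parse_line(line)
  rescue CSV::MalformedCSVError
    # Record the 1-based line number and continue with the next line.
    parse_error_rows << index + 1
  end
end
```

Parsing line by line sidesteps the problem of resuming a parser after it has raised, at the cost of not supporting quoted fields that contain newlines.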
|
230
|
+
|
231
|
+
## Someday Features
|
232
|
+
|
233
|
+
#### Column Numbers
|
234
|
+
|
235
|
+
CSVParty is entirely oriented around a CSV file having a header. This is not
|
236
|
+
always the case, though. This would add the ability to specify columns using a
|
237
|
+
column number, rather than a header. A rough sketch of the API might look like:
|
238
|
+
|
239
|
+
class MyImporter < CSVParty::Importer
|
240
|
+
column :product, number: 7
|
241
|
+
column :quantity, number: 8, as: :integer
|
242
|
+
column :price, number: 9, as: :decimal
|
243
|
+
end
|
244
|
+
|
245
|
+
#### Multi-column Parsing
|
246
|
+
|
247
|
+
The whole idea behind custom parsers is that it makes for much cleaner code to
|
248
|
+
get all the logic related to parsing a raw value into a useful intermediate
|
249
|
+
object in one place, away from the larger logic of what needs to happen to each
|
250
|
+
row. Sometimes, though, you need access to multiple column values to create a
|
251
|
+
useful parsed value. Here is what an API for that might look like:
|
252
|
+
|
253
|
+
column :total, header: ['Price', 'Quantity'] do |price, quantity|
|
254
|
+
BigDecimal.new(price) * BigDecimal.new(quantity)
|
255
|
+
end
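Until something like this exists, the same calculation can be done by pulling both values out of a row and passing them to a single parser. This sketch uses a plain hash as the row and a hypothetical `total_parser` lambda, with `BigDecimal(...)` in place of `BigDecimal.new` so it runs on newer Rubies.

```ruby
require 'bigdecimal'

row = { 'Price' => '9.99', 'Quantity' => '3' }

# Hypothetical parser that receives one argument per listed header.
total_parser = ->(price, quantity) { BigDecimal(price) * BigDecimal(quantity) }

total = total_parser.call(*row.values_at('Price', 'Quantity'))
```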
|
256
|
+
|
257
|
+
#### Parse Dependencies
|
258
|
+
|
259
|
+
Sometimes, while parsing a column, it would be useful to have access to the
|
260
|
+
parsed value from another column. This would make that possible. Here is what
|
261
|
+
that might look like:
|
262
|
+
|
263
|
+
class MyImporter < CSVParty::Importer
|
264
|
+
column :customer do |customer_id|
|
265
|
+
Customer.find(customer_id)
|
266
|
+
end
|
267
|
+
|
268
|
+
column :order, depends_on: :customer do |order_id, customer|
|
269
|
+
customer.orders.find(order_id)
|
270
|
+
end
|
271
|
+
end
|