smarter_csv 1.15.2 → 1.16.0

Files changed (48)
  1. checksums.yaml +4 -4
  2. data/.rubocop.yml +9 -0
  3. data/CHANGELOG.md +68 -1
  4. data/CONTRIBUTORS.md +3 -1
  5. data/Gemfile +1 -0
  6. data/README.md +123 -27
  7. data/docs/_introduction.md +40 -24
  8. data/docs/bad_row_quarantine.md +285 -0
  9. data/docs/basic_read_api.md +151 -9
  10. data/docs/basic_write_api.md +474 -59
  11. data/docs/batch_processing.md +161 -4
  12. data/docs/column_selection.md +183 -0
  13. data/docs/data_transformations.md +162 -29
  14. data/docs/examples.md +339 -46
  15. data/docs/header_transformations.md +93 -12
  16. data/docs/header_validations.md +56 -18
  17. data/docs/history.md +117 -0
  18. data/docs/instrumentation.md +165 -0
  19. data/docs/migrating_from_csv.md +290 -0
  20. data/docs/options.md +150 -87
  21. data/docs/parsing_strategy.md +63 -1
  22. data/docs/real_world_csv.md +262 -0
  23. data/docs/releases/1.16.0/benchmarks.md +223 -0
  24. data/docs/releases/1.16.0/changes.md +272 -0
  25. data/docs/releases/1.16.0/performance_notes.md +114 -0
  26. data/docs/row_col_sep.md +14 -5
  27. data/docs/value_converters.md +193 -57
  28. data/ext/smarter_csv/extconf.rb +3 -0
  29. data/ext/smarter_csv/smarter_csv.c +1007 -71
  30. data/images/SmarterCSV_1.16.0_vs_RubyCSV_3.3.5_speedup.png +0 -0
  31. data/images/SmarterCSV_1.16.0_vs_RubyCSV_3.3.5_speedup.svg +108 -0
  32. data/images/SmarterCSV_1.16.0_vs_previous_C-speedup.png +0 -0
  33. data/images/SmarterCSV_1.16.0_vs_previous_C-speedup.svg +141 -0
  34. data/images/SmarterCSV_1.16.0_vs_previous_Rb-speedup.png +0 -0
  35. data/images/SmarterCSV_1.16.0_vs_previous_Rb-speedup.svg +139 -0
  36. data/lib/smarter_csv/errors.rb +8 -0
  37. data/lib/smarter_csv/file_io.rb +1 -1
  38. data/lib/smarter_csv/hash_transformations.rb +14 -13
  39. data/lib/smarter_csv/header_transformations.rb +21 -2
  40. data/lib/smarter_csv/headers.rb +2 -1
  41. data/lib/smarter_csv/options.rb +124 -7
  42. data/lib/smarter_csv/parser.rb +362 -75
  43. data/lib/smarter_csv/reader.rb +494 -46
  44. data/lib/smarter_csv/version.rb +1 -1
  45. data/lib/smarter_csv/writer.rb +71 -19
  46. data/lib/smarter_csv.rb +95 -12
  47. data/smarter_csv.gemspec +20 -10
  48. metadata +37 -80
@@ -0,0 +1,285 @@
1
+
2
+ ### Contents
3
+
4
+ * [Introduction](./_introduction.md)
5
+ * [Migrating from Ruby CSV](./migrating_from_csv.md)
6
+ * [Parsing Strategy](./parsing_strategy.md)
7
+ * [The Basic Read API](./basic_read_api.md)
8
+ * [The Basic Write API](./basic_write_api.md)
9
+ * [Batch Processing](./batch_processing.md)
10
+ * [Configuration Options](./options.md)
11
+ * [Row and Column Separators](./row_col_sep.md)
12
+ * [Header Transformations](./header_transformations.md)
13
+ * [Header Validations](./header_validations.md)
14
+ * [Column Selection](./column_selection.md)
15
+ * [Data Transformations](./data_transformations.md)
16
+ * [Value Converters](./value_converters.md)
17
+ * [**Bad Row Quarantine**](./bad_row_quarantine.md)
18
+ * [Instrumentation Hooks](./instrumentation.md)
19
+ * [Examples](./examples.md)
20
+ * [Real-World CSV Files](./real_world_csv.md)
21
+ * [SmarterCSV over the Years](./history.md)
22
+ * [Release Notes](./releases/1.16.0/changes.md)
23
+
24
+ --------------
25
+
26
+ # Bad Row Quarantine
27
+
28
+ Real-world CSV files are often malformed. By default, SmarterCSV raises an exception on the
29
+ first bad row it encounters. The `on_bad_row` option lets you keep processing and handle bad
30
+ rows in whatever way suits your application.
31
+
32
+ ## What counts as a bad row
33
+
34
+ - Malformed CSV (unclosed quoted fields, unterminated multiline rows)
35
+ - A field that exceeds `field_size_limit` (see [Limiting field size](#limiting-field-size-field_size_limit))
36
+ - Extra columns when running in `strict: true` mode
37
+ - Any `SmarterCSV::Error` or `EOFError` raised during row parsing
38
+
39
+ ## Options
40
+
41
+ | Option | Default | Description |
42
+ |--------|---------|-------------|
43
+ | `on_bad_row` | `:raise` | How to handle a bad row: `:raise`, `:skip`, `:collect`, or a callable |
44
+ | `collect_raw_lines` | `true` | Include `raw_logical_line` in the error record |
45
+ | `bad_row_limit` | `nil` | Raise `SmarterCSV::TooManyBadRows` after this many bad rows |
46
+
47
+ ## Modes
48
+
49
+ ### `:raise` (default)
50
+
51
+ Current behavior — the exception propagates and processing stops:
52
+
53
+ ```ruby
54
+ SmarterCSV.process('data.csv')
55
+ # => raises SmarterCSV::MalformedCSV on the first bad row
56
+ ```
57
+
58
+ ### `:skip`
59
+
60
+ Silently skip bad rows and continue. The count of skipped rows is available on
61
+ `reader.errors[:bad_row_count]`. No error records are stored.
62
+
63
+ ```ruby
64
+ reader = SmarterCSV::Reader.new('data.csv', on_bad_row: :skip)
65
+ result = reader.process
66
+
67
+ puts "Processed: #{result.size} good rows"
68
+ puts "Skipped: #{reader.errors[:bad_row_count] || 0} bad rows"
69
+ ```
70
+
71
+ ### `:collect`
72
+
73
+ Continue processing and store a structured error record for each bad row in
74
+ `reader.errors[:bad_rows]`. Requires using `SmarterCSV::Reader` directly (the
75
+ `SmarterCSV.process` convenience method discards the reader instance and cannot
76
+ return the collected errors).
77
+
78
+ ```ruby
79
+ reader = SmarterCSV::Reader.new('data.csv', on_bad_row: :collect)
80
+ result = reader.process
81
+
82
+ result.each { |row| MyModel.create!(row) }
83
+
84
+ reader.errors[:bad_rows].each do |rec|
85
+ Rails.logger.warn "Bad row at line #{rec[:csv_line_number]}: #{rec[:error_message]}"
86
+ Rails.logger.warn "Raw content: #{rec[:raw_logical_line]}"
87
+ end
88
+ ```
89
+
90
+ ### Callable (lambda / proc)
91
+
92
+ Pass any object that responds to `#call`. It is invoked once per bad row with the
93
+ error record hash, then processing continues. Useful for streaming errors to a
94
+ dead-letter queue, a metrics system, or a separate file.
95
+
96
+ ```ruby
97
+ # Log to a dead-letter file
98
+ quarantine = File.open('quarantine.csv', 'w')
99
+
100
+ reader = SmarterCSV::Reader.new('data.csv',
101
+ on_bad_row: ->(rec) { quarantine.puts(rec[:raw_logical_line]) }
102
+ )
103
+ reader.process
104
+ quarantine.close
105
+ ```
106
+
107
+ ```ruby
108
+ # Send to a monitoring system
109
+ reader = SmarterCSV::Reader.new('data.csv',
110
+ on_bad_row: ->(rec) { Metrics.increment('csv.bad_rows', tags: { error: rec[:error_class].name }) }
111
+ )
112
+ reader.process
113
+ ```
114
+
115
+ ```ruby
116
+ # Collect into your own structure
117
+ errors = []
118
+ reader = SmarterCSV::Reader.new('data.csv',
119
+ on_bad_row: ->(rec) { errors << rec }
120
+ )
121
+ result = reader.process
122
+ ```
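Since the only requirement stated above is that the handler responds to `#call`, a small class can serve in place of a lambda when the handler needs its own state. The sketch below is a minimal illustration: `QuarantineHandler` is a hypothetical name, and the record is built by hand so the example runs without reading a file.

```ruby
# Hypothetical handler class (not part of SmarterCSV): any object that
# responds to #call could be passed as the on_bad_row option.
class QuarantineHandler
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # rec is an error record hash with the keys described in this document
  def call(rec)
    @counts[rec[:error_class].to_s] += 1
  end
end

handler = QuarantineHandler.new

# Invoked by hand with made-up records, only to show the calling contract
handler.call({ error_class: ArgumentError, error_message: "example", csv_line_number: 3 })
handler.call({ error_class: ArgumentError, error_message: "example", csv_line_number: 7 })

puts handler.counts.inspect
```

In real use the instance would presumably be passed as `on_bad_row: handler`, and `handler.counts` inspected after processing.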
123
+
124
+ ## Error record structure
125
+
126
+ Each error record is a Hash:
127
+
128
+ ```ruby
129
+ {
130
+ csv_line_number: 3, # logical row (counting header as row 1)
131
+ file_line_number: 3, # physical file line where the row started
132
+ file_lines_consumed: 1, # physical lines spanned (>1 for multiline)
133
+ error_class: SmarterCSV::HeaderSizeMismatch, # exception class object
134
+ error_message: "extra columns detected ...", # exception message string
135
+ raw_logical_line: "Jane,25,Boston,EXTRA_DATA\n", # present when collect_raw_lines: true (default)
136
+ }
137
+ ```
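Because each record is a plain Hash, collected records can be summarized with ordinary Ruby. The following sketch assumes records shaped like the one above; the two error classes are local stand-ins (hypothetical names) so the example runs without the gem loaded.

```ruby
# Stand-in error classes, used only so this sketch is self-contained
class ExampleMalformed < StandardError; end
class ExampleSizeMismatch < StandardError; end

# Hand-written records that mirror the documented structure
bad_rows = [
  { csv_line_number: 3, file_line_number: 3, file_lines_consumed: 1,
    error_class: ExampleSizeMismatch, error_message: "extra columns",
    raw_logical_line: "Jane,25,Boston,EXTRA\n" },
  { csv_line_number: 9, file_line_number: 10, file_lines_consumed: 4,
    error_class: ExampleMalformed, error_message: "unclosed quote",
    raw_logical_line: "1,\"a\nb\nc\nd\n" },
]

# Count bad rows per error class
summary = bad_rows.group_by { |rec| rec[:error_class] }.transform_values(&:size)

# Logical row numbers of bad rows that spanned more than one physical line
multiline = bad_rows.select { |rec| rec[:file_lines_consumed] > 1 }
                    .map { |rec| rec[:csv_line_number] }

puts summary.inspect
puts multiline.inspect
```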
138
+
139
+ ### `collect_raw_lines`
140
+
141
+ `collect_raw_lines: true` (default) — `raw_logical_line` is always included in the error
142
+ record. Set to `false` if you want to reduce memory usage and don't need the raw content:
143
+
144
+ ```ruby
145
+ reader = SmarterCSV::Reader.new('data.csv',
146
+ on_bad_row: :collect,
147
+ collect_raw_lines: false,
148
+ )
149
+ ```
150
+
151
+ For multiline rows (quoted fields spanning several physical lines), `raw_logical_line` contains
152
+ the fully stitched content — it may include embedded newline characters. The
153
+ `file_lines_consumed` field tells you how many physical lines were read.
154
+
155
+ ## Limiting bad rows with `bad_row_limit`
156
+
157
+ To abort processing after too many failures, set `bad_row_limit`. This works with `:skip`,
158
+ `:collect`, and callable modes:
159
+
160
+ ```ruby
161
+ reader = SmarterCSV::Reader.new('data.csv',
162
+ on_bad_row: :collect,
163
+ bad_row_limit: 10,
164
+ )
165
+
166
+ begin
167
+ result = reader.process
168
+ rescue SmarterCSV::TooManyBadRows => e
169
+ puts "Aborting: #{e.message}"
170
+ puts "Collected so far: #{reader.errors[:bad_rows].size} bad rows"
171
+ end
172
+ ```
173
+
174
+ ## Accessing errors
175
+
176
+ Bad row data is stored on the `Reader` instance:
177
+
178
+ | Attribute | Description |
179
+ |-----------|-------------|
180
+ | `reader.errors[:bad_row_count]` | Total bad rows encountered (all modes) |
181
+ | `reader.errors[:bad_rows]` | Array of error records (`:collect` mode only) |
182
+
183
+ Note: `SmarterCSV.process` (the convenience method) discards the `Reader` instance after
184
+ returning. To access `reader.errors`, always instantiate `SmarterCSV::Reader` directly.
185
+
186
+ ## Chunked processing
187
+
188
+ Bad row quarantine works seamlessly with `chunk_size`. Skipped rows are simply not added to the
189
+ current chunk — chunk sizes remain consistent:
190
+
191
+ ```ruby
192
+ reader = SmarterCSV::Reader.new('large_file.csv',
193
+ chunk_size: 500,
194
+ on_bad_row: :collect,
195
+ )
196
+ reader.process do |chunk, index|
197
+ MyModel.import(chunk)
198
+ end
199
+ puts "Bad rows: #{reader.errors[:bad_row_count]}"
200
+ ```
201
+
202
+ ## Limiting field size: `field_size_limit`
203
+
204
+ Real-world CSV files sometimes contain unexpectedly large fields — either intentionally
205
+ (a DoS attempt) or accidentally (a forgotten closing quote, a JSON blob in a cell, a notes
206
+ field that ran away). Without a limit, SmarterCSV will happily stitch together physical lines
207
+ until it either finds the closing quote or reaches end-of-file, potentially consuming hundreds
208
+ of megabytes.
209
+
210
+ `field_size_limit` sets a hard cap (in bytes) on the size of any individual extracted field.
211
+ The default is `nil` (no limit). When a field exceeds the limit a
212
+ `SmarterCSV::FieldSizeLimitExceeded` exception is raised — and because it inherits from
213
+ `SmarterCSV::Error`, the `on_bad_row` option handles it exactly like any other parse error.
214
+
215
+ ### The three cases it prevents
216
+
217
+ **1. Huge inline field** — a single-line field containing a large payload (e.g. a JSON blob,
218
+ a base64-encoded file, or a runaway notes column):
219
+
220
+ ```csv
221
+ id,payload
222
+ 1,"{... 500 KB of JSON ...}"
223
+ ```
224
+
225
+ **2. Quoted field spanning many embedded newlines** — a legitimate multiline field in a
226
+ poorly exported file that happens to be enormous:
227
+
228
+ ```csv
229
+ ticket_id,notes
230
+ 42,"Customer wrote:
231
+ ... (thousands of lines of chat history) ...
232
+ "
233
+ ```
234
+
235
+ **3. Never-closing quoted field** — a missing closing quote causes the parser to stitch every
236
+ subsequent physical line into one logical row until EOF:
237
+
238
+ ```csv
239
+ id,comment
240
+ 1,"this quote never closes
241
+ 2,this entire row is now inside the field
242
+ 3,and this one too ...
243
+ ```
244
+
245
+ Without `field_size_limit`, case 3 reads the entire rest of the file into memory. With the
246
+ limit set, the stitch loop raises `FieldSizeLimitExceeded` as soon as the accumulating buffer
247
+ crosses the threshold.
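To make case 3 concrete, here is a deliberately simplified simulation of a stitch loop with a byte cap. It is not SmarterCSV's implementation: `read_logical_row` and `ExampleFieldTooLarge` are hypothetical names, the cap is applied to the whole accumulating buffer, and an open quoted field is detected naively by counting double quotes (backslash escapes are ignored).

```ruby
require 'stringio'

# Hypothetical stand-in for SmarterCSV::FieldSizeLimitExceeded
class ExampleFieldTooLarge < StandardError; end

# Reads one logical row: keeps appending physical lines while the number of
# double quotes seen so far is odd, i.e. while a quoted field is still open.
def read_logical_row(io, limit: nil)
  buffer = +""
  while (line = io.gets)
    buffer << line
    if limit && buffer.bytesize > limit
      raise ExampleFieldTooLarge, "buffer grew past #{limit} bytes"
    end
    return buffer if buffer.count('"').even?
  end
  buffer.empty? ? nil : buffer
end

data = "id,comment\n1,\"this quote never closes\n2,row\n3,row\n"

io = StringIO.new(data)
header = read_logical_row(io, limit: 30) # the header has no open quote

begin
  read_logical_row(io, limit: 30)        # stitching stops once the cap is crossed
rescue ExampleFieldTooLarge => e
  outcome = e.message
end

puts outcome
```

Calling `read_logical_row` without `limit:` on the same input stitches everything after the header into a single logical row, which is the unbounded behaviour the option is meant to prevent.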
248
+
249
+ ### Usage
250
+
251
+ ```ruby
252
+ # Raise immediately on any oversized field (default on_bad_row: :raise)
253
+ SmarterCSV.process('data.csv', field_size_limit: 1_000_000) # 1 MB per field
254
+
255
+ # Skip oversized rows and continue
256
+ SmarterCSV.process('data.csv', field_size_limit: 1_000_000, on_bad_row: :skip)
257
+
258
+ # Collect oversized rows for inspection
259
+ reader = SmarterCSV::Reader.new('data.csv',
260
+ field_size_limit: 1_000_000,
261
+ on_bad_row: :collect,
262
+ )
263
+ result = reader.process
264
+ reader.errors[:bad_rows].each do |rec|
265
+ Rails.logger.warn "Oversized field on row #{rec[:csv_line_number]}: #{rec[:error_message]}"
266
+ end
267
+ ```
268
+
269
+ ### What "bytes" means here
270
+
271
+ The limit is checked against `String#bytesize` (raw byte count), not character count.
272
+ For ASCII content they are identical. For multi-byte UTF-8 content (e.g. CJK characters)
273
+ bytesize is larger than the character count — so the limit is a memory cap, not a
274
+ character cap, which is what matters for DoS protection.
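The difference between the two measures is easy to check in plain Ruby. The strings below are arbitrary examples; the CJK text is written with `\u` escapes so the snippet does not depend on the source file's encoding.

```ruby
ascii = "hello"
cjk   = "\u65E5\u672C\u8A9E" # three CJK characters, UTF-8 encoded

ascii_sizes = [ascii.length, ascii.bytesize] # characters, bytes
cjk_sizes   = [cjk.length, cjk.bytesize]     # each character takes 3 bytes in UTF-8

puts ascii_sizes.inspect
puts cjk_sizes.inspect
```

A limit chosen in "characters" would therefore be up to several times too generous for multi-byte content, which is why a byte count is the safer unit.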
275
+
276
+ ### Performance
277
+
278
+ `field_size_limit` is zero-overhead when not set (the default `nil` short-circuits all
279
+ checks). When set, a single integer comparison is performed per logical row; the per-field
280
+ scan only runs when the raw line is large enough to potentially contain an oversized field.
281
+ Normal rows (where the entire line fits within the limit) bypass per-field checking entirely.
282
+
283
+ --------------------
284
+
285
+ PREVIOUS: [Value Converters](./value_converters.md) | NEXT: [Instrumentation Hooks](./instrumentation.md) | UP: [README](../README.md)
@@ -2,6 +2,7 @@
2
2
  ### Contents
3
3
 
4
4
  * [Introduction](./_introduction.md)
5
+ * [Migrating from Ruby CSV](./migrating_from_csv.md)
5
6
  * [Parsing Strategy](./parsing_strategy.md)
6
7
  * [**The Basic Read API**](./basic_read_api.md)
7
8
  * [The Basic Write API](./basic_write_api.md)
@@ -10,10 +11,17 @@
10
11
  * [Row and Column Separators](./row_col_sep.md)
11
12
  * [Header Transformations](./header_transformations.md)
12
13
  * [Header Validations](./header_validations.md)
14
+ * [Column Selection](./column_selection.md)
13
15
  * [Data Transformations](./data_transformations.md)
14
16
  * [Value Converters](./value_converters.md)
15
-
16
- --------------
17
+ * [Bad Row Quarantine](./bad_row_quarantine.md)
18
+ * [Instrumentation Hooks](./instrumentation.md)
19
+ * [Examples](./examples.md)
20
+ * [Real-World CSV Files](./real_world_csv.md)
21
+ * [SmarterCSV over the Years](./history.md)
22
+ * [Release Notes](./releases/1.16.0/changes.md)
23
+
24
+ --------------
17
25
 
18
26
  # SmarterCSV Basic API
19
27
 
@@ -22,7 +30,7 @@ Let's explore the basic APIs for reading and writing CSV files. There is a simpl
22
30
  ## Reading CSV
23
31
 
24
32
  SmarterCSV has convenient defaults for automatically detecting row and column separators based on the given data. This provides more robust parsing of input files when you have no control over the data, e.g. when users upload CSV files.
25
- Learn more about this [in this section](docs/examples/row_col_sep.md).
33
+ Learn more about this [in this section](./row_col_sep.md).
26
34
 
27
35
  ### Simplified Interface
28
36
 
@@ -32,11 +40,23 @@ The simplified call to read CSV files is:
32
40
  array_of_hashes = SmarterCSV.process(file_or_input, options)
33
41
 
34
42
  ```
43
+
44
+ To parse a CSV **string** directly (no file needed), use `SmarterCSV.parse`:
45
+
46
+ ```
47
+ array_of_hashes = SmarterCSV.parse(csv_string, options)
48
+
49
+ ```
50
+
51
+ This is equivalent to `SmarterCSV.process(StringIO.new(csv_string), options)` and is the
52
+ idiomatic replacement for `CSV.parse(str, headers: true, header_converters: :symbol)`.
53
+ See [Migrating from Ruby CSV](./migrating_from_csv.md) for a full comparison.
54
+
35
55
  It can also be used with a block. The block always receives an array of hashes and an optional chunk index:
36
56
 
37
57
  ```
38
58
  SmarterCSV.process(file_or_input, options) do |array_of_hashes|
39
- # without chunk_size, each yield conatins a one-element array (one row)
59
+ # without chunk_size, each yield contains a one-element array (one row)
40
60
  end
41
61
  ```
42
62
 
@@ -81,11 +101,133 @@ It can also be used with a block. The block always receives an array of hashes a
81
101
  This allows you access to the internal state of the `reader` instance after processing.
82
102
 
83
103
 
104
+ ## Modern Enumerator API — `each`
105
+
106
+ `Reader#each` is the modern, idiomatic way to read CSV rows one at a time. It always yields a single `Hash` per row, and because `Reader` includes `Enumerable`, every standard Ruby enumerable method works out of the box.
107
+
108
+ ### Simplified form
109
+
110
+ ```ruby
111
+ SmarterCSV.each('data.csv', options) do |hash|
112
+ MyModel.upsert(hash)
113
+ end
114
+ ```
115
+
116
+ ### Full form (recommended — retains reader state after processing)
117
+
118
+ ```ruby
119
+ reader = SmarterCSV::Reader.new('data.csv', options)
120
+
121
+ reader.each do |hash|
122
+ MyModel.upsert(hash)
123
+ end
124
+
125
+ puts reader.headers # accessible after processing
126
+ puts reader.errors.inspect
127
+ ```
128
+
129
+ ### Returns an Enumerator when called without a block
130
+
131
+ ```ruby
132
+ enum = SmarterCSV.each('data.csv', options)
133
+ enum.to_a # => [{ name: "Alice", ... }, { name: "Bob", ... }, ...]
134
+ ```
135
+
136
+ ### Enumerable methods work directly
137
+
138
+ Because `Reader` includes `Enumerable`, all standard Ruby enumerable methods work:
139
+
140
+ ```ruby
141
+ reader = SmarterCSV::Reader.new('data.csv', options)
142
+
143
+ # Filter rows
144
+ us_users = reader.select { |h| h[:country] == 'US' }
145
+
146
+ # Transform
147
+ names = reader.map { |h| h[:name] }
148
+
149
+ # Count good rows
150
+ reader.count
151
+
152
+ # Row index (0-based count of successfully parsed rows, excluding bad rows)
153
+ reader.each_with_index do |hash, i|
154
+ puts "Row #{i}: #{hash[:name]}"
155
+ end
156
+
157
+ # Free chunking via Enumerable — no chunk_size needed
158
+ reader.each_slice(100) do |batch|
159
+ MyModel.insert_all(batch)
160
+ end
161
+ ```
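The mechanism behind this is standard Ruby: any class that defines `each` and includes `Enumerable` gains `select`, `map`, `each_slice`, and the rest. The sketch below uses a hypothetical `RowSource` class over an in-memory array to show the pattern; it says nothing about how `Reader` is implemented internally.

```ruby
# Hypothetical class illustrating the Enumerable contract
class RowSource
  include Enumerable

  def initialize(rows)
    @rows = rows
  end

  # Yield one Hash per row; return an Enumerator when no block is given
  def each
    return enum_for(:each) unless block_given?

    @rows.each { |row| yield row }
  end
end

source = RowSource.new([
  { name: "Alice", country: "US" },
  { name: "Bob",   country: "DE" },
  { name: "Carol", country: "US" },
])

us_names = source.select { |h| h[:country] == "US" }.map { |h| h[:name] }
batches  = source.each_slice(2).to_a

puts us_names.inspect
puts batches.size
```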
162
+
163
+ ### Lazy evaluation
164
+
165
+ `lazy` lets you stop early without reading the entire file:
166
+
167
+ ```ruby
168
+ # Read only the first 10 rows matching a condition
169
+ reader = SmarterCSV::Reader.new('big.csv', options)
170
+ result = reader.lazy.select { |h| h[:status] == 'active' }.first(10)
171
+ ```
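The early-stop behaviour can be observed without any CSV file. This sketch substitutes a plain `Enumerator` that produces 1,000 made-up rows for the reader and counts how many rows are actually generated before `first(3)` is satisfied.

```ruby
rows_read = 0

# Stand-in row source: generates rows on demand and counts each one produced
source = Enumerator.new do |y|
  1.upto(1_000) do |i|
    rows_read += 1
    y << { id: i, status: i.even? ? "active" : "inactive" }
  end
end

# With lazy, iteration stops as soon as three matching rows have been found
first_three = source.lazy.select { |h| h[:status] == "active" }.first(3)

puts first_three.map { |h| h[:id] }.inspect
puts rows_read
```

Without `lazy`, `select` would consume all 1,000 rows before `first(3)` ran.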
172
+
173
+ ### `each` ignores `chunk_size`
174
+
175
+ If `chunk_size` is set in options, `each` ignores it and always yields individual `Hash` objects. Use [`each_chunk`](./batch_processing.md) for chunked batch processing.
176
+
177
+ ### Interaction with `on_bad_row`
178
+
179
+ `each` respects all `on_bad_row` options. Bad rows are skipped (or routed to your handler) and never yielded:
180
+
181
+ ```ruby
182
+ reader = SmarterCSV::Reader.new('data.csv', on_bad_row: :collect)
183
+ reader.each { |hash| MyModel.upsert(hash) }
184
+ reader.errors[:bad_rows].each { |rec| puts "Bad row: #{rec[:error_message]}" }
185
+ ```
186
+
187
+ ---
188
+
189
+ ## Value Transformation Pipeline
190
+
191
+ After each row is parsed, SmarterCSV applies transformations to field values in this order:
192
+
193
+ | Step | Option | Default | Description |
194
+ |------|--------|---------|-------------|
195
+ | 1 | `strip_whitespace` | `true` | Strips leading/trailing whitespace from all values (and headers) at parse time |
196
+ | 2 | `nil_values_matching` | `nil` | Sets values matching the regexp to `nil` |
197
+ | 3 | `remove_empty_values` | `true` | Removes keys whose value is `nil` or blank |
198
+ | 4 | `remove_zero_values` | `false` | Removes keys whose value is numeric zero |
199
+ | 5 | `convert_values_to_numeric` | `true` | Converts numeric-looking strings to `Integer` or `Float` |
200
+ | 6 | `value_converters` | `nil` | Applies per-key custom converter lambdas or classes |
201
+ | 7 | `remove_empty_hashes` | `true` | Drops rows that are entirely empty after all transformations |
202
+
203
+ > Steps 2–6 run per field, in that order, for every key/value pair in the row.
204
+ > `value_converters` receive the value **after** numeric conversion — guard against `Integer`/`Float` input if needed.
205
+
206
+ See [Data Transformations](./data_transformations.md) and [Value Converters](./value_converters.md) for details.
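As an illustration of the guard mentioned in the note above, a converter can branch on the class of the incoming value. The lambda below is a hypothetical example (a price-to-cents conversion) and is called directly; how converters are registered is covered in the Value Converters page.

```ruby
# Hypothetical converter: returns a price in integer cents.
# It accepts values that were already converted to Integer/Float,
# as well as strings that numeric conversion left untouched.
to_cents = lambda do |value|
  case value
  when Integer then value * 100
  when Float   then (value * 100).round
  when String  then (Float(value.delete("$,")) * 100).round
  else value # nil and anything else pass through unchanged
  end
end

puts to_cents.call(12)
puts to_cents.call(12.5)
puts to_cents.call("$1,234.50")
```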
207
+
208
+ ---
209
+
210
+ ## Header Transformation Pipeline
211
+
212
+ Before any data rows are processed, the header line passes through these steps:
213
+
214
+ ```
215
+ comment_regexp → strip_chars_from_headers → split on col_sep → strip quote_char
216
+ → strip_whitespace → [gsub spaces/dashes→_ → downcase_header]
217
+ → disambiguate_headers → symbolize → key_mapping
218
+ ```
219
+
220
+ `user_provided_headers` bypasses the file header and all transformation steps — your array is used as-is.
221
+
222
+ See [Header Transformations](./header_transformations.md) for the full step-by-step table and options.
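A rough approximation of the middle of this pipeline in plain Ruby may help when predicting what keys a header line will produce. `transform_headers` is a hypothetical helper, not library code: it omits `comment_regexp`, `strip_chars_from_headers`, and `disambiguate_headers`, and its naive `split` does not handle separators inside quoted headers.

```ruby
# Simplified sketch of: split on col_sep -> strip quote_char -> strip whitespace
# -> spaces/dashes to underscores -> downcase -> symbolize -> key_mapping
def transform_headers(line, col_sep: ",", quote_char: '"', key_mapping: {})
  line.chomp.split(col_sep).map do |header|
    header = header.delete(quote_char).strip
    header = header.gsub(/[\s-]+/, "_").downcase
    key = header.to_sym
    key_mapping.fetch(key, key) # rename only the keys listed in the mapping
  end
end

keys = transform_headers(%("First Name", Last-Name ,AGE\n), key_mapping: { age: :years })
puts keys.inspect
```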
223
+
224
+ ---
225
+
84
226
  ## Rescue from Exceptions
85
227
 
86
228
  While SmarterCSV uses sensible defaults to process the most common CSV files, it will raise exceptions if it can not auto-detect `col_sep`, `row_sep`, or if it encounters other problems. Therefore please rescue from `SmarterCSV::Error`, and handle outliers according to your requirements.
87
229
 
88
- If you encounter unusual CSV files, please follow the tips in the Troubleshooting section below. You can use the options below to accomodate for unusual formats.
230
+ If you encounter unusual CSV files, please follow the tips in the Troubleshooting section below. You can use the options below to accommodate unusual formats.
89
231
 
90
232
  ## Troubleshooting
91
233
 
@@ -102,9 +244,8 @@ $ hexdump -C spec/fixtures/bom_test_feff.csv
102
244
 
103
245
  ## Assumptions / Limitations
104
246
 
105
- * the escape character is `\`, as on UNIX and Windows systems.
106
- * quote charcters around fields are balanced, e.g. valid: `"field"`, invalid: `"field\"`
107
- e.g. an escaped `quote_char` does not denote the end of a field.
247
+ * By default, quote escaping uses `:auto` mode: SmarterCSV tries backslash-escape (`\"`) first and falls back to RFC 4180 doubled quotes (`""`). Use `quote_escaping: :double_quotes` or `:backslash` to fix the mode explicitly. See [Parsing Strategy](./parsing_strategy.md).
248
+ * Quote characters around fields are expected to be balanced, e.g. valid: `"field"`, invalid: `"field\"` — an escaped `quote_char` does not denote the end of a field.
108
249
 
109
250
 
110
251
  ## NOTES about File Encodings:
@@ -125,4 +266,5 @@ $ hexdump -C spec/fixtures/bom_test_feff.csv
125
266
  ```
126
267
 
127
268
  ----------------
128
- PREVIOUS: [Parsing Strategy](./parsing_strategy.md) | NEXT: [The Basic Write API](./basic_write_api.md)
269
+
270
+ PREVIOUS: [Parsing Strategy](./parsing_strategy.md) | NEXT: [The Basic Write API](./basic_write_api.md) | UP: [README](../README.md)