smarter_csv 1.17.1 → 1.17.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +246 -63
- data/CONTRIBUTORS.md +2 -1
- data/README.md +6 -3
- data/UPGRADING.md +251 -0
- data/docs/.nojekyll +0 -0
- data/docs/upgrade_path.json +175 -0
- data/docs/upgrade_wizard.html +498 -0
- data/ext/smarter_csv/smarter_csv.c +248 -323
- data/lib/smarter_csv/parser.rb +40 -12
- data/lib/smarter_csv/version.rb +1 -1
- data/smarter_csv.gemspec +7 -5
- metadata +8 -3
- data/TO_DO.md +0 -109
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: ec50e8539c6872f9c86c25eabc2982e39846ad07dc5a21021fc687c7661f8084
|
|
4
|
+
data.tar.gz: 977ce04d8dd225b6042ea03ad0c174305f3ea122340fad052e2c2ada440d6400
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: 0452dc7f15ab31b0cdfad83ca718e17e6456cf6c9826d177e606c5924f3ec72a155c86ee6f9f938540fe3b2ed8f694a981c95cf775b5a38d7f7e44318bc453a3
|
|
7
|
+
data.tar.gz: c1c9732d6d4393fb2ffa995f0c7bb73cd60f566132d487a86007f8a5a623257365c2324ed894a5b38d681df7cfab67069d9fa2e61fd525ba51675954ddadad7a
|
data/CHANGELOG.md
CHANGED
|
@@ -1,8 +1,46 @@
|
|
|
1
1
|
|
|
2
2
|
# SmarterCSV 1.x Change Log
|
|
3
3
|
|
|
4
|
+
> [!TIP]
|
|
5
|
+
> **Upgrading?** The [SmarterCSV Upgrade Wizard](https://tilo.github.io/smarter_csv/upgrade_wizard.html) walks you through what (if anything) you need to change for your specific version. Most hops do not require any changes.
|
|
4
6
|
|
|
5
|
-
## 1.17.
|
|
7
|
+
## 1.17.3 (2026-05-26)
|
|
8
|
+
|
|
9
|
+
RSpec tests: **2,274 → 2,277** (+3 tests)
|
|
10
|
+
|
|
11
|
+
* No functional changes
|
|
12
|
+
* added 3 test cases
|
|
13
|
+
|
|
14
|
+
### Improvements
|
|
15
|
+
* DRY-up C-code
|
|
16
|
+
* no performance changes on the C-path
|
|
17
|
+
|
|
18
|
+
### Performance
|
|
19
|
+
* performance improvement on the Ruby-path
|
|
20
|
+
|
|
21
|
+
| File | RB-path |
|
|
22
|
+
|-----------------------------------|--------------|
|
|
23
|
+
| PEOPLE_IMPORT_B / PEOPLE_IMPORT_C | 13.5% faster |
|
|
24
|
+
| tab_separated_60k | 13.2% faster |
|
|
25
|
+
| sample_100k | 10.3% faster |
|
|
26
|
+
| multi_char_separator | 9.0% faster |
|
|
27
|
+
| utf8_multibyte | 7.1% faster |
|
|
28
|
+
| many_empty_fields | 6.7% faster |
|
|
29
|
+
| PEOPLE_IMPORT_NC | 5.2% faster |
|
|
30
|
+
| sensor_data | 4.5% faster |
|
|
31
|
+
|
|
32
|
+
|
|
33
|
+
## 1.17.2 (2026-05-21)
|
|
34
|
+
|
|
35
|
+
RSpec tests: **2,220 → 2,274** (+54 tests)
|
|
36
|
+
|
|
37
|
+
### Bug Fixes
|
|
38
|
+
|
|
39
|
+
- fixed [Issue #334](https://github.com/tilo/smarter_csv/issues/334) with escaped double quote followed by comma. Thanks to [conorg](https://github.com/conorg)
|
|
40
|
+
- fixed bug when using `headers: { except: }`
|
|
41
|
+
- added more tests
|
|
42
|
+
|
|
43
|
+
## 1.17.1 (2026-05-17)
|
|
6
44
|
|
|
7
45
|
RSpec tests: **2,210 → 2,220** (+10 tests)
|
|
8
46
|
|
|
@@ -64,7 +102,17 @@ Measured against 1.16.4 (Apple M4, Ruby 3.4.7):
|
|
|
64
102
|
|
|
65
103
|
Per-file breakdown: [`docs/releases/1.17.0/performance_notes.md`](docs/releases/1.17.0/performance_notes.md).
|
|
66
104
|
|
|
67
|
-
## 1.16.
|
|
105
|
+
## 1.16.6 (2026-05-21)
|
|
106
|
+
|
|
107
|
+
RSpec tests: **1,467 → 1,591** (+124 tests)
|
|
108
|
+
|
|
109
|
+
### Bug Fixes
|
|
110
|
+
|
|
111
|
+
- fixed [Issue #334](https://github.com/tilo/smarter_csv/issues/334) with escaped double quote followed by comma. Thanks to [conorg](https://github.com/conorg)
|
|
112
|
+
- fixed bug when using `headers: { except: }`
|
|
113
|
+
- added more tests
|
|
114
|
+
|
|
115
|
+
## 1.16.5 (2026-05-17)
|
|
68
116
|
|
|
69
117
|
### Bug Fix
|
|
70
118
|
|
|
@@ -155,23 +203,42 @@ RSpec tests: **1,247 → 1,410** (+163 tests)
|
|
|
155
203
|
|
|
156
204
|
* Added 163 tests covering new features and corner cases
|
|
157
205
|
|
|
158
|
-
## 1.16.0 (2026-03-12) —
|
|
206
|
+
## 1.16.0 (2026-03-12) — improved RFC 4180 quote handling, new APIs, large performance gains
|
|
159
207
|
|
|
160
208
|
[Full details](docs/releases/1.16.0/changes.md) · [Benchmarks](docs/releases/1.16.0/benchmarks.md) · [Performance notes](docs/releases/1.16.0/performance_notes.md)
|
|
161
209
|
|
|
162
210
|
RSpec tests: **714 → 1,247** (+533 tests)
|
|
163
211
|
|
|
164
|
-
###
|
|
212
|
+
### (Bug Fix) `quote_boundary:` — new default for how mid-field quotes are handled
|
|
213
|
+
|
|
214
|
+
**In short — most users will see incorrect output silently improve. If your CSV files don't contain stray `"` characters in the middle of unquoted fields, you are not affected. If they do, the new default produces correct output where the old default produced corrupted output.**
|
|
215
|
+
|
|
216
|
+
A new option `quote_boundary:` controls when a `"` character marks the start or end of a quoted field versus when it's a literal character inside the field.
|
|
165
217
|
|
|
166
|
-
|
|
167
|
-
*
|
|
168
|
-
mid-field quotes are treated as literal characters.
|
|
218
|
+
* `quote_boundary: :standard` (the new default) — quotes are only recognized as field delimiters at field boundaries (start of a field, or immediately before `col_sep` / end of line). A `"` that appears in the middle of an unquoted field is treated as a literal character. This matches RFC 4180 and Ruby's standard `CSV` library.
|
|
219
|
+
* `quote_boundary: :legacy` — **not recommended.** Restores the pre-1.16.0 behavior, where any `"` could open a quoted region. This is the behavior that produced silently corrupt output on files with stray mid-field quotes; it exists only as an escape hatch for code that built workarounds on top of the buggy output. New code should never use this.
|
|
169
220
|
|
|
170
|
-
|
|
171
|
-
were already producing silently corrupt output in previous versions — so most users will see
|
|
172
|
-
correct behavior improve, not regress.
|
|
221
|
+
In practice, the old `:legacy` behavior was silently producing corrupt output whenever a CSV file contained a stray mid-field `"` — so for most users this change makes output **correct** where it was wrong before, not the other way around.
|
|
173
222
|
|
|
174
|
-
|
|
223
|
+
#### You are NOT affected if:
|
|
224
|
+
- Your CSV files don't contain any `"` characters mid-field (the common case).
|
|
225
|
+
- Your CSV files quote fields cleanly per RFC 4180 (well-formed `"..."` around each quoted field, no stray quotes inside other fields).
|
|
226
|
+
|
|
227
|
+
#### You are affected if:
|
|
228
|
+
- Your CSV files contain stray `"` characters in the middle of unquoted fields (e.g. `5'6"`, `Joe "the Hat" Smith` without surrounding quotes), **and** you had downstream code that compensated for the previously-corrupted parse output.
|
|
229
|
+
|
|
230
|
+
#### How to migrate
|
|
231
|
+
|
|
232
|
+
For almost everyone: do nothing. Upgrade and observe that the output is the same or more correct.
|
|
233
|
+
|
|
234
|
+
The `quote_boundary: :legacy` option exists only as a short-term escape hatch — **we do not advise using it**, because it re-enables the buggy parse behavior that motivated this change. If your code built workarounds on top of the previously-corrupted output, the right fix is to remove those workarounds and rely on the new `:standard` behavior, not to opt back into the bug:
|
|
235
|
+
|
|
236
|
+
```ruby
|
|
237
|
+
# Only as a temporary escape hatch — not recommended for new code:
|
|
238
|
+
SmarterCSV.process('file.csv', quote_boundary: :legacy)
|
|
239
|
+
```
|
|
240
|
+
|
|
241
|
+
See [Parsing Strategy](docs/parsing_strategy.md) for details on how each mode handles edge cases.
|
|
175
242
|
|
|
176
243
|
### Performance
|
|
177
244
|
|
|
@@ -222,7 +289,7 @@ Measured on 19 benchmark files, Apple M1, Ruby 3.4.7. See [benchmarks](docs/rele
|
|
|
222
289
|
* **Writer temp file** no longer hardcoded to `/tmp` (fixes Windows); properly cleaned up with `Tempfile#close!`.
|
|
223
290
|
* **Writer `StringIO`**: `finalize` no longer attempts to close a caller-owned `StringIO`.
|
|
224
291
|
|
|
225
|
-
## 1.15.3 (2026-05-
|
|
292
|
+
## 1.15.3 (2026-05-17)
|
|
226
293
|
|
|
227
294
|
### Bug Fix
|
|
228
295
|
|
|
@@ -390,44 +457,90 @@ _worldcities.csv is [from here](https://simplemaps.com/data/world-cities)_
|
|
|
390
457
|
## 1.13.1 (2024-12-12)
|
|
391
458
|
* fix bug with SmarterCSV.generate with `force_quotes: true` ([issue 294](https://github.com/tilo/smarter_csv/issues/294))
|
|
392
459
|
|
|
393
|
-
## 1.13.0 (2024-11-06)
|
|
394
|
-
|
|
395
|
-
CHANGED DEFAULT BEHAVIOR
|
|
396
|
-
========================
|
|
397
|
-
The changes are to improve robustness and to reduce the risk of data loss
|
|
460
|
+
## 1.13.0 (2024-11-06) — Three default-behavior changes that prevent silent data loss
|
|
398
461
|
|
|
399
|
-
|
|
462
|
+
This release flipped three defaults so that SmarterCSV no longer silently loses data in three specific edge cases. For most users this is a quiet improvement — files that used to lose rows or columns silently now parse correctly with no code changes. Each change below has a short "affected if / not affected if" so you can skip past it quickly.
|
|
400
463
|
|
|
401
|
-
|
|
402
|
-
-> SmarterCSV will now raise `SmarterCSV::MalformedCSV` for unbalanced quote_char.
|
|
464
|
+
The motivation for all three changes is the same: data loss should never be silent. Either parse it correctly, or raise loudly.
|
|
403
465
|
|
|
404
|
-
|
|
405
|
-
|
|
406
|
-
* previous behavior:
|
|
407
|
-
when a CSV row had more columns than listed in the header, the additional columns were ignored
|
|
466
|
+
### Change 1 (Bug Fix): extra columns in a row are auto-named instead of dropped
|
|
408
467
|
|
|
409
|
-
|
|
410
|
-
* new default behavior is to auto-generate additional headers, e.g. :column_7, :column_8, etc
|
|
411
|
-
* you can set option `:strict` to true in order to get a `SmarterCSV::MalformedCSV` exception instead
|
|
468
|
+
(Thanks to James Fenley, [issue #284](https://github.com/tilo/smarter_csv/issues/284).)
|
|
412
469
|
|
|
413
|
-
|
|
414
|
-
|
|
415
|
-
|
|
470
|
+
If a CSV row had more columns than the header (e.g. header has 6 columns, a row has 8), the extras used to be **silently dropped**. As of 1.13.0 they survive as `:column_7`, `:column_8`, etc.
|
|
471
|
+
|
|
472
|
+
#### You are NOT affected if:
|
|
473
|
+
- Your CSV files have exactly as many columns per row as headers (the common case).
|
|
474
|
+
|
|
475
|
+
#### You are affected if:
|
|
476
|
+
- Your CSV files have rows with extra columns past the header **and** your code expects only the header-listed keys.
|
|
477
|
+
|
|
478
|
+
#### How to migrate
|
|
479
|
+
|
|
480
|
+
If you want the old "ignore extras" behavior, drop the extra keys yourself. If you want loud failure instead, use the strict mode:
|
|
481
|
+
|
|
482
|
+
```ruby
|
|
483
|
+
# Raise SmarterCSV::MalformedCSV on extra columns:
|
|
484
|
+
SmarterCSV.process('file.csv', strict: true)
|
|
485
|
+
```
|
|
486
|
+
|
|
487
|
+
(In 1.16.0 this option was renamed to `missing_headers: :raise`, but `strict: true` still works.)
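For the first route, dropping the extra keys yourself, a minimal sketch in plain Ruby could look like the following. The sample rows and the `expected_keys` list are made up for illustration; only the `:column_7` / `:column_8` naming comes from the description above.

```ruby
# Hypothetical parse result: the header listed six columns, but the second
# row carried two extra values, which survive as :column_7 and :column_8.
rows = [
  { id: 1, first: 'Ann', last: 'Lee', email: 'ann@example.com', city: 'Oslo', zip: '0150' },
  { id: 2, first: 'Bob', last: 'Ray', email: 'bob@example.com', city: 'Bern', zip: '3011',
    column_7: 'x', column_8: 'y' },
]

# The keys your own code expects (an assumption for this sketch).
expected_keys = %i[id first last email city zip]

# Hash#slice keeps only the listed keys and ignores any that are absent.
trimmed = rows.map { |row| row.slice(*expected_keys) }
```

This runs after parsing, so it works the same way regardless of which options were passed to `SmarterCSV.process`.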
|
|
488
|
+
|
|
489
|
+
### Change 2 (Bug Fix): unbalanced quotes raise `MalformedCSV` instead of producing garbage
|
|
490
|
+
|
|
491
|
+
(Thanks to Simon Rentzke, James Fenley, Randall B, and Matthew Kennedy. Issues [#283](https://github.com/tilo/smarter_csv/issues/283), [#288](https://github.com/tilo/smarter_csv/issues/288).)
|
|
492
|
+
|
|
493
|
+
Files with an unbalanced `quote_char` (an opening `"` with no matching close) used to parse to corrupted output. As of 1.13.0 they raise `SmarterCSV::MalformedCSV`.
|
|
494
|
+
|
|
495
|
+
#### You are NOT affected if:
|
|
496
|
+
- Your CSV files have well-formed quotes (the common case).
|
|
497
|
+
|
|
498
|
+
#### You are affected if:
|
|
499
|
+
- Some of your input files have unbalanced quotes and you used to silently live with the garbled output.
|
|
500
|
+
|
|
501
|
+
#### How to migrate
|
|
416
502
|
|
|
417
|
-
|
|
503
|
+
If you need to keep processing other files even when one is malformed, rescue the new exception:
|
|
418
504
|
|
|
419
|
-
|
|
420
|
-
|
|
421
|
-
|
|
505
|
+
```ruby
|
|
506
|
+
begin
|
|
507
|
+
SmarterCSV.process('file.csv')
|
|
508
|
+
rescue SmarterCSV::MalformedCSV => e
|
|
509
|
+
warn "Skipping malformed file: #{e.message}"
|
|
510
|
+
end
|
|
511
|
+
```
|
|
422
512
|
|
|
423
|
-
|
|
424
|
-
Setting `user_provided_headers` sets `headers_in_file: false`
|
|
425
|
-
a) Improved behavior if there was no header in the input data.
|
|
426
|
-
b) If there was a header in the input data, and `user_provided_headers` is used to override the headers in the file, then please explicitly specify `headers_in_file: true`, otherwise you will get an extra hash which includes the header data.
|
|
513
|
+
### Change 3 (Bug Fix): `user_provided_headers:` now implies `headers_in_file: false`
|
|
427
514
|
|
|
428
|
-
|
|
515
|
+
([Issue #282](https://github.com/tilo/smarter_csv/issues/282).)
|
|
429
516
|
|
|
430
|
-
|
|
517
|
+
This one fixes a quiet footgun: if you passed `user_provided_headers:` and the file had **no** header row, SmarterCSV used to treat the first data row as a header and silently drop it. As of 1.13.0, setting `user_provided_headers:` automatically sets `headers_in_file: false`, so the first row is treated as data — which is what you almost always wanted.
|
|
518
|
+
|
|
519
|
+
#### You are NOT affected if:
|
|
520
|
+
- You don't use `user_provided_headers:`.
|
|
521
|
+
- You use `user_provided_headers:` with files that have no header line (the common case — that's what the option is for).
|
|
522
|
+
|
|
523
|
+
#### You are affected if:
|
|
524
|
+
- You pass `user_provided_headers:` **and** your CSV file **does** have a header line that needs to be skipped.
|
|
525
|
+
|
|
526
|
+
#### How to migrate
|
|
527
|
+
|
|
528
|
+
If your file has a header line **and** you're overriding it with `user_provided_headers:`, add `headers_in_file: true` explicitly so the existing header line is skipped:
|
|
529
|
+
|
|
530
|
+
```ruby
|
|
531
|
+
# File has a header row that you want to override:
|
|
532
|
+
SmarterCSV.process(
|
|
533
|
+
'file.csv',
|
|
534
|
+
user_provided_headers: [:id, :name, :email],
|
|
535
|
+
headers_in_file: true, # skip the header row in the file
|
|
536
|
+
)
|
|
537
|
+
```
|
|
538
|
+
|
|
539
|
+
Without `headers_in_file: true`, you will get an extra hash at the top of your results containing the file's original header strings as values — that's the symptom to look for.
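A quick way to check for that symptom is to compare the first result hash against the header text you know is in the file. The helper below is a hypothetical sketch, not part of SmarterCSV, and it assumes the leaked values come through unchanged as strings; adjust the comparison if your options strip or convert values.

```ruby
# Returns true when the first result hash merely repeats the file's header line.
# `original_headers` is the header text as it appears in the file (assumed known).
def header_row_leaked?(rows, original_headers)
  first = rows.first
  return false if first.nil?

  first.values.map(&:to_s) == original_headers
end

# Hypothetical result where the file's header line was parsed as data:
rows = [
  { id: 'ID', name: 'Name', email: 'Email' },
  { id: 1, name: 'Ann', email: 'ann@example.com' },
]

header_row_leaked?(rows, %w[ID Name Email]) # => true
```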
|
|
540
|
+
|
|
541
|
+
### Documentation
|
|
542
|
+
|
|
543
|
+
* Improved documentation for handling numeric columns with leading zeroes (e.g. ZIP codes). Use `convert_values_to_numeric: { except: [:zip] }` to keep that column as a string. (Available since 1.10.x.) Thanks to David Moles, [issue #151](https://github.com/tilo/smarter_csv/issues/151).
|
|
431
544
|
|
|
432
545
|
## 1.12.1 (2024-07-10)
|
|
433
546
|
* Improved column separator detection by ignoring quoted sections [#276](https://github.com/tilo/smarter_csv/pull/276) (thanks to Nicolas Castellanos)
|
|
@@ -481,23 +594,66 @@ _worldcities.csv is [from here](https://simplemaps.com/data/world-cities)_
|
|
|
481
594
|
## 1.10.1 (2024-01-07)
|
|
482
595
|
* fix incorrect warning about UTF-8 (issue #268, thanks hirowatari)
|
|
483
596
|
|
|
484
|
-
## 1.10.0 (2023-12-31)
|
|
597
|
+
## 1.10.0 (2023-12-31) — Behavior changes for `user_provided_headers:` and duplicate headers
|
|
485
598
|
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
489
|
-
|
|
490
|
-
|
|
491
|
-
|
|
492
|
-
|
|
493
|
-
|
|
494
|
-
|
|
495
|
-
|
|
496
|
-
|
|
497
|
-
|
|
498
|
-
|
|
499
|
-
|
|
500
|
-
|
|
599
|
+
Two small behavior changes plus performance and memory improvements. Most users are not affected. Read on for who needs to look closer.
|
|
600
|
+
|
|
601
|
+
### Change 1 (Improvement): `user_provided_headers:` is now taken literally (no transformations, no duplicates)
|
|
602
|
+
|
|
603
|
+
**In short — if you use `user_provided_headers:`, write the list in the exact form you want the result keys (all symbols *or* all strings), and make sure there are no duplicates. For most users this is already what you were doing.**
|
|
604
|
+
|
|
605
|
+
Before 1.10.0, any list you passed as `user_provided_headers:` was run through the same header pipeline as in-file headers — `strings_as_keys` could flip strings to symbols, etc. Duplicates were silently accepted. As of 1.10.0, the list is used **literally**: no transformations are applied, and duplicates raise `SmarterCSV::DuplicateHeaders`.
|
|
606
|
+
|
|
607
|
+
This is almost always what people actually wanted: if you're explicitly listing the headers, you want *those* headers, not a transformed version of them.
|
|
608
|
+
|
|
609
|
+
#### You are NOT affected if:
|
|
610
|
+
- You don't use `user_provided_headers:`.
|
|
611
|
+
- Your `user_provided_headers:` list is already in the form you want (all symbols *or* all strings, no duplicates).
|
|
612
|
+
In these cases, you can just upgrade without any code changes.
|
|
613
|
+
|
|
614
|
+
#### You are affected if either is true:
|
|
615
|
+
- You pass `user_provided_headers:` **and** relied on `strings_as_keys:` to flip between string/symbol keys.
|
|
616
|
+
- You pass `user_provided_headers:` **and** had accidental duplicates in the list that the library used to silently accept (this case would be very odd).
|
|
617
|
+
|
|
618
|
+
#### How to migrate
|
|
619
|
+
|
|
620
|
+
```ruby
|
|
621
|
+
# If you want symbol keys, write symbols directly:
|
|
622
|
+
SmarterCSV.process('file.csv', user_provided_headers: [:id, :name, :email])
|
|
623
|
+
|
|
624
|
+
# If you want string keys, write strings directly:
|
|
625
|
+
SmarterCSV.process('file.csv', user_provided_headers: ['id', 'name', 'email'])
|
|
626
|
+
```
|
|
627
|
+
|
|
628
|
+
Drop any `strings_as_keys:` option you used alongside `user_provided_headers:` — it's ignored in that case now.
|
|
629
|
+
|
|
630
|
+
If you see `SmarterCSV::DuplicateHeaders` after upgrading, your list has a repeat in it — fix the duplicate and you're done.
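To find the repeat in a long list before it ever reaches the parser, a small plain-Ruby check is enough. The `headers` list here is a made-up example.

```ruby
# Hypothetical header list with an accidental repeat.
headers = [:id, :name, :email, :name]

# Enumerable#tally (Ruby 2.7+) counts occurrences; keep entries seen more than once.
duplicates = headers.tally.select { |_, count| count > 1 }.keys
# duplicates => [:name]
```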
|
|
631
|
+
|
|
632
|
+
### Change 2 (Improvement): duplicate headers in the CSV file are now auto-disambiguated
|
|
633
|
+
|
|
634
|
+
**In short — if your input CSV has duplicate column headers, they now Just Work instead of colliding. If your files don't have duplicate headers, you are not affected.**
|
|
635
|
+
|
|
636
|
+
`duplicate_header_suffix:` used to default to `nil`. Now it defaults to `''` (empty string), which means a file with headers like `name,name,name` becomes keys `name`, `name2`, `name3` automatically — no more silently overwriting earlier columns.
|
|
637
|
+
|
|
638
|
+
#### You are affected if:
|
|
639
|
+
- You depended on SmarterCSV raising or failing fast when a CSV has duplicate headers (e.g. as a data-quality check at the boundary of your pipeline).
|
|
640
|
+
|
|
641
|
+
#### You are NOT affected if:
|
|
642
|
+
- Your CSVs don't have duplicate headers.
|
|
643
|
+
- You already explicitly set `duplicate_header_suffix:` in your code.
|
|
644
|
+
|
|
645
|
+
#### How to migrate
|
|
646
|
+
|
|
647
|
+
If you want the old strict behavior, set the option explicitly to `nil`:
|
|
648
|
+
|
|
649
|
+
```ruby
|
|
650
|
+
SmarterCSV.process('file.csv', duplicate_header_suffix: nil)
|
|
651
|
+
```
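To make the renaming concrete, the sketch below reproduces the `name`, `name2`, `name3` pattern in plain Ruby. It is an illustration only and not the library's implementation; the `disambiguate` helper is hypothetical, and placing a non-empty suffix between the name and the counter is an assumption.

```ruby
# Illustration only: NOT SmarterCSV's code.
# The first occurrence keeps its name; later ones get suffix + running count.
def disambiguate(headers, suffix = '')
  seen = Hash.new(0)
  headers.map do |header|
    seen[header] += 1
    seen[header] == 1 ? header : "#{header}#{suffix}#{seen[header]}"
  end
end

disambiguate(%w[name name name]) # => ["name", "name2", "name3"]
```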
|
|
652
|
+
|
|
653
|
+
### Other
|
|
654
|
+
|
|
655
|
+
* Performance and memory improvements
|
|
656
|
+
* Internal code refactor
|
|
501
657
|
|
|
502
658
|
## 1.9.3 (2023-12-16)
|
|
503
659
|
* raise SmarterCSV::IncorrectOption when `user_provided_headers` are empty
|
|
@@ -635,13 +791,40 @@ _worldcities.csv is [from here](https://simplemaps.com/data/world-cities)_
|
|
|
635
791
|
* fixed buggy behavior when using `remove_empty_values: false` (issue #168)
|
|
636
792
|
* fixed Ruby 3.0 deprecation
|
|
637
793
|
|
|
638
|
-
## 1.3.0 (2022-02-06)
|
|
639
|
-
|
|
640
|
-
|
|
794
|
+
## 1.3.0 (2022-02-06)
|
|
795
|
+
|
|
796
|
+
### (Bug Fix) Small change for users of the `key_mapping:` option (issue #181)
|
|
797
|
+
|
|
798
|
+
**In short — if you use `key_mapping:`, this is a one-character fix per mapping. If you don't use `key_mapping:`, you are not affected.**
|
|
799
|
+
|
|
800
|
+
Previously, the values in a `key_mapping:` hash were silently coerced to symbols, so `'new_name'` and `:new_name` produced the same result key. As of 1.3.0, the values are used as-is — strings stay strings, symbols stay symbols. This gives you direct control over whether the result hashes use string or symbol keys.
|
|
801
|
+
|
|
802
|
+
#### You are NOT affected if any of these are true:
|
|
803
|
+
- You don't use `key_mapping:`.
|
|
804
|
+
- Your `key_mapping:` already uses symbol values (e.g. `:new_name`).
|
|
805
|
+
- Your downstream code already reads result hashes with string keys.
|
|
806
|
+
In these cases, you can just upgrade without any code changes.
|
|
807
|
+
|
|
808
|
+
#### You are affected if all three are true:
|
|
809
|
+
- You pass `key_mapping:` to `SmarterCSV.process` (or `process_csv` in older code), **and**
|
|
810
|
+
- The values in that hash are strings (e.g. `'new_name'`, not `:new_name`), **and**
|
|
811
|
+
- Your downstream code reads the result hashes with symbol keys (e.g. `row[:new_name]`).
|
|
812
|
+
This needs a small code change.
|
|
813
|
+
|
|
814
|
+
#### How to migrate
|
|
815
|
+
|
|
816
|
+
Pick whichever is the smaller diff in your code:
|
|
817
|
+
|
|
818
|
+
```ruby
|
|
819
|
+
# Option A — keep symbol keys in the result (one extra colon per line):
|
|
820
|
+
SmarterCSV.process('file.csv', key_mapping: { 'Old Header' => :new_name })
|
|
821
|
+
# ^ add the colon
|
|
822
|
+
|
|
823
|
+
# Option B — switch your reads to string keys:
|
|
824
|
+
row['new_name'] # instead of row[:new_name]
|
|
825
|
+
```
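The reason the lookup breaks is ordinary Ruby hash behavior: a string key and a symbol key are different keys. The row below is a hypothetical stand-in for a parsed result. The `transform_keys` line is a possible stopgap applied after parsing, not something the changelog recommends over Option A or B.

```ruby
# Hypothetical row, as produced when key_mapping maps to the string 'new_name'.
row = { 'new_name' => 'value' }

row[:new_name]  # => nil (the symbol lookup misses the string key)
row['new_name'] # => "value"

# Stopgap: convert keys to symbols after parsing (Hash#transform_keys, Ruby 2.5+).
symbolized = row.transform_keys(&:to_sym)
symbolized[:new_name] # => "value"
```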
|
|
641
826
|
|
|
642
|
-
|
|
643
|
-
* either use symbols in the `key_mapping` hash
|
|
644
|
-
* or change the expected keys from symbols to strings
|
|
827
|
+
That's the whole migration. Everything else in 1.3.0 is source-compatible with 1.2.x.
|
|
645
828
|
|
|
646
829
|
## 1.2.9 (2021-11-22) (PULLED)
|
|
647
830
|
* fix bug for key_mappings (issue #181)
|
|
@@ -668,7 +851,7 @@ _worldcities.csv is [from here](https://simplemaps.com/data/world-cities)_
|
|
|
668
851
|
* bugfix (thanks to Joshua Smith for reporting)
|
|
669
852
|
|
|
670
853
|
## 1.2.0 (2018-01-20)
|
|
671
|
-
* add default validation that a header can only appear once
|
|
854
|
+
* add default validation that a header can only appear once; raises `SmarterCSV::DuplicateHeaders` when a header appears more than once
|
|
672
855
|
* add option `required_headers`
|
|
673
856
|
|
|
674
857
|
## 1.1.5 (2017-11-05)
|
data/CONTRIBUTORS.md
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
# A Big Thank You to all
|
|
1
|
+
# A Big Thank You to all 64 Contributors!!
|
|
2
2
|
|
|
3
3
|
|
|
4
4
|
A Big Thank you to everyone who filed issues, sent comments, and who contributed with pull requests:
|
|
@@ -66,3 +66,4 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
|
|
|
66
66
|
* [Dom Lebron](https://github.com/biglebronski)
|
|
67
67
|
* [Paho Lurie-Gregg](https://github.com/paholg)
|
|
68
68
|
* [Jonas Staškevičius](https://github.com/pirminis)
|
|
69
|
+
* [conorg](https://github.com/conorg)
|
data/README.md
CHANGED
|
@@ -1,8 +1,11 @@
|
|
|
1
1
|
|
|
2
2
|
# SmarterCSV
|
|
3
3
|
|
|
4
|
-
|
|
4
|
+
 [](https://codecov.io/gh/tilo/smarter_csv) [](https://rubygems.org/gems/smarter_csv) [](https://rubygems.org/gems/smarter_csv) [](https://www.ruby-toolbox.com/projects/smarter_csv) [](https://tilo.github.io/smarter_csv/upgrade_wizard.html)
|
|
5
5
|
|
|
6
|
+
> [!TIP]
|
|
7
|
+
> **Upgrading from an older version?** Use the [SmarterCSV Upgrade Wizard](https://tilo.github.io/smarter_csv/upgrade_wizard.html) to walk through what (if anything) you need to change for your specific version. Most hops do not require any changes.
|
|
8
|
+
|
|
6
9
|
SmarterCSV is a high-performance CSV ingestion and generation library for Ruby, focused on fast end-to-end CSV ingestion of real-world data — no silent failures, no surprises, not just tokenization.
|
|
7
10
|
|
|
8
11
|
⭐ If SmarterCSV saved you hours of import time, please star the repo, and consider sponsoring this project.
|
|
@@ -311,7 +314,7 @@ Or install it yourself as:
|
|
|
311
314
|
* [Examples](docs/examples.md)
|
|
312
315
|
* [Real-World CSV Files](docs/real_world_csv.md)
|
|
313
316
|
* [SmarterCSV over the Years](docs/history.md)
|
|
314
|
-
* [Release Notes](docs/releases/1.
|
|
317
|
+
* [Release Notes](docs/releases/1.17.0/changes.md)
|
|
315
318
|
|
|
316
319
|
## Articles
|
|
317
320
|
* [Parsing CSV Files in Ruby with SmarterCSV](https://tilo-sloboda.medium.com/parsing-csv-files-in-ruby-with-smartercsv-6ce66fb6cf38)
|
|
@@ -333,7 +336,7 @@ For reporting issues, please:
|
|
|
333
336
|
* open a pull-request adding a test that demonstrates the issue
|
|
334
337
|
* mention your version of SmarterCSV, Ruby, Rails
|
|
335
338
|
|
|
336
|
-
# [A Special Thanks to all
|
|
339
|
+
# [A Special Thanks to all 64 Contributors!](CONTRIBUTORS.md) 🎉🎉🎉
|
|
337
340
|
|
|
338
341
|
|
|
339
342
|
## Contributing
|