smarter_csv 1.11.0.pre2 → 1.11.0

This diff shows the changes between two publicly released versions of the package, as they appear in the public registry. It is provided for informational purposes only.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 9027a37c4b29e68fcbc559a6a8285e5076684883612e98b13f116526dadc6e4b
4
- data.tar.gz: 43cfa0254ac2caa8ca02a8863fc790f1c56beb0b16e175a16fd947f92eda8c08
3
+ metadata.gz: 996d8195dc01722461990cb51dbc2b65d4cf2edf93eff2c1d4c5359614911b92
4
+ data.tar.gz: 0f193b450b4cfb97a1f11bbeacbc659b5220fa545751533a225347349943bc9a
5
5
  SHA512:
6
- metadata.gz: d5b2eac35e33bdeb9ec632578207c47f34a08a4595c1ebf04929a7fe302efc3fc08565c0d6fc6454fd538b99cc1c4a5662599a88693e3aa1b80c5c0c7fc1b05e
7
- data.tar.gz: 36f622f12d5412ef8919c30a267def315985aef4b3f33c209341a046442795bb66df6c8ebb97399f35e24fd565df658a0e4b7a94d3563831d7ce32facbeab33f
6
+ metadata.gz: 1a8141ccf75d5f6edf5ffc6f4b719f36dabfc826903b69804cfda09988f43fce24678e7e44fe53e0228a3077858fa870b920bf019b9dd996c53d9024364767ca
7
+ data.tar.gz: 25daf165cb30decb0f7cd5703758dde5879a12398e6826fb8e0def3efb135b33ab8ef2bf26149d0bd887d42d73a19f9af8ab5e7ea1aee737e139a3acf5da6a4f
data/.rspec ADDED
@@ -0,0 +1 @@
1
+ --require spec_helper
data/.rubocop.yml CHANGED
@@ -88,18 +88,12 @@ Style/IfInsideElse:
88
88
  Style/IfUnlessModifier:
89
89
  Enabled: false
90
90
 
91
- Style/InverseMethods:
92
- Enabled: false
93
-
94
91
  Style/NestedTernaryOperator:
95
92
  Enabled: false
96
93
 
97
94
  Style/PreferredHashMethods:
98
95
  Enabled: false
99
96
 
100
- Style/Proc:
101
- Enabled: false
102
-
103
97
  Style/NumericPredicate:
104
98
  Enabled: false
105
99
 
@@ -118,6 +112,9 @@ Style/SlicingWithRange:
118
112
  Style/SpecialGlobalVars: # DANGER: unsafe rule!!
119
113
  Enabled: false
120
114
 
115
+ Style/StringConcatenation:
116
+ Enabled: false
117
+
121
118
  Style/StringLiterals:
122
119
  Enabled: false
123
120
  EnforcedStyle: double_quotes
@@ -135,9 +132,6 @@ Style/SymbolProc: # old Ruby versions can't do this
135
132
  Style/TrailingCommaInHashLiteral:
136
133
  Enabled: false
137
134
 
138
- Style/TrailingCommaInArrayLiteral:
139
- Enabled: false
140
-
141
135
  Style/TrailingUnderscoreVariable:
142
136
  Enabled: false
143
137
 
@@ -147,9 +141,6 @@ Style/TrivialAccessors:
147
141
  # Style/UnlessModifier:
148
142
  # Enabled: false
149
143
 
150
- Style/WordArray:
151
- Enabled: false
152
-
153
144
  Style/ZeroLengthPredicate:
154
145
  Enabled: false
155
146
 
data/CHANGELOG.md CHANGED
@@ -1,42 +1,16 @@
1
1
 
2
2
  # SmarterCSV 1.x Change Log
3
3
 
4
- ## T.B.D.
5
-
6
- * code refactor
4
+ ## 1.11.0 (2024-07-02)
5
+ * added SmarterCSV::Writer to output CSV files ([issue #44](https://github.com/tilo/smarter_csv/issues/44))
7
6
 
8
- * NEW BEHAVIOR:
9
- - hidden `:v2_mode` options (incomplete!)
10
- - pre-processing for v2 options
11
- - implemented v2 `:header_transformations` (DO NOT USE YET!)
12
- + -> check if all v1 transformations are correctly done
13
- How are we going to
14
- * disambiguate headers?
15
-
16
-
17
- * do key_mapping? -> seems to work
18
- - remove_unmapped_keys ?
19
- - silence missing keys ... a missing mapped key should raise an exception, except when silenced
20
- - required_keys needs to be a header-validation
21
-
7
+ ## 1.10.3 (2024-03-10)
8
+ * fixed issue when frozen options are handed in (thanks to Daniel Pepper)
9
+ * cleaned-up rspec tests (thanks to Daniel Pepper)
10
+ * fixed link in README (issue #251)
22
11
 
23
- * keep original headers? -> :none
24
- * do strings_as_* ? -> either :keys_as_symbols, :keys_as_strings
25
- * remove quote_chars? -> included in keys_as_*
26
- * strip whitespace? -> included in keys_as_*
27
-
28
- TODO:
29
-
30
- - add tests for header_validations
31
-
32
- - modify options to handle v1 and v2 options
33
- - add v1 defaults in v2 processing
34
- - add tests for all options processing
35
- - 100% backwards compatibility when working in v1 mode
36
-
37
-
38
- ## 1.10.1 (2024-01-07)
39
- * fix incorrect warning about UTF-8 (issue #268, thanks hirowatari)
12
+ ## 1.10.2 (2024-02-11)
13
+ * improve error message for missing keys
40
14
 
41
15
  ## 1.10.1 (2024-01-07)
42
16
  * fix incorrect warning about UTF-8 (issue #268, thanks hirowatari)
data/CONTRIBUTORS.md CHANGED
@@ -51,4 +51,5 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
51
51
  * [Rahul Chaudhary](https://github.com/rahulch95)
52
52
  * [Alessandro Fazzi](https://github.com/pioneerskies)
53
53
  * [JP Camara](https://github.com/jpcamara)
54
- * [Hiro Watari](https://github.com/hirowatari)
54
+ * [Kenton Hirowatari](https://github.com/hirowatari)
55
+ * [Daniel Pepper](https://github.com/dpep)
data/README.md CHANGED
@@ -3,10 +3,11 @@
3
3
 
4
4
  [![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
5
5
 
6
+ This library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.
6
7
 
7
- #### LATEST CHANGES
8
+ #### BREAKING CHANGES
8
9
 
9
- * Version 1.10.0 has BREAKING CHANGES:
10
+ * Version 1.10.0 had BREAKING CHANGES:
10
11
 
11
12
  Changed behavior:
12
13
  + when `user_provided_headers` are provided:
@@ -23,13 +24,8 @@
23
24
 
24
25
  * default branch is `main` for 1.x development
25
26
 
26
- * 2.x development is on `2.0-development` (check this branch for 2.0 documentation)
27
- - This is an EXPERIMENTAL branch - DO NOT USE in production
28
-
29
- #### Work towards Future Version 2.x
30
-
31
- * Work towards SmarterCSV 2.x is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
32
- Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
27
+ * 2.x development is [MOVED TO THIS PR](https://github.com/tilo/smarter_csv/pull/267)
28
+ - 2.x behavior is still EXPERIMENTAL - DO NOT USE in production
33
29
 
34
30
  ---------------
35
31
 
@@ -394,10 +390,9 @@ And header and data validations will also be supported in 2.x
394
390
  * some CSV files use un-escaped quotation characters inside fields. This can cause the import to break. To get around this, use the `:force_simple_split => true` option in combination with `:strip_chars_from_headers => /[\-"]/` . This will also significantly speed up the import.
395
391
  If you would force a different :quote_char instead (setting it to a non-used character), then the import would be up to 5-times slower than using `:force_simple_split`.
396
392
 
397
- ## See also:
398
-
399
- http://www.unixgods.org/~tilo/Ruby/process_csv_as_hashes.html
393
+ ## The original post that started SmarterCSV:
400
394
 
395
+ http://www.unixgods.org/Ruby/process_csv_as_hashes.html
401
396
 
402
397
 
403
398
  ## Installation
@@ -2,16 +2,7 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class << self
5
- # this is processing the headers from the input file
6
5
  def hash_transformations(hash, options)
7
- if options[:v2_mode]
8
- hash_transformations_v2(hash, options)
9
- else
10
- hash_transformations_v1(hash, options)
11
- end
12
- end
13
-
14
- def hash_transformations_v1(hash, options)
15
6
  # there may be unmapped keys, or keys purposely mapped to nil or an empty key..
16
7
  # make sure we delete any key/value pairs from the hash, which the user wanted to delete:
17
8
  remove_empty_values = options[:remove_empty_values] == true
@@ -42,117 +33,46 @@ module SmarterCSV
42
33
  end
43
34
  end
44
35
 
45
- def hash_transformations_v2(hash, options)
46
- return hash if options[:hash_transformations].nil? || options[:hash_transformations].empty?
47
-
48
- # do the header transformations the user requested:
49
- if options[:hash_transformations]
50
- options[:hash_transformations].each do |transformation|
51
- if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
52
- hash = transformation.call(hash, options)
53
- else
54
- case transformation
55
- when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
56
- hash = public_send(transformation, hash, options)
57
- when Hash # this is called for hash arguments, e.g. hash_transformations
58
- trans, args = transformation.first # .first treats the hash first element as an array
59
- hash = apply_transformation(trans, hash, args, options)
60
- when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
61
- trans, *args = transformation
62
- hash = apply_transformation(trans, hash, args, options)
63
- else
64
- raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
65
- end
66
- end
67
- end
68
- end
69
-
70
- hash
71
- end
72
-
73
- #
74
- # To handle v1-backward-compatible behavior, it is faster to roll all behavior into one method
75
- #
76
- def v1_backwards_compatibility(hash, options)
77
- hash.each_with_object({}) do |(k, v), new_hash|
78
- next if k.nil? || k == '' || k == :"" # remove_empty_keys
79
- next if has_rails ? v.blank? : blank?(v) # remove_empty_values
80
-
81
- # convert_values_to_numeric:
82
- # deal with the :only / :except options to :convert_values_to_numeric
83
- unless limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)
84
- if v =~ /^[+-]?\d+\.\d+$/
85
- v = v.to_f
86
- elsif v =~ /^[+-]?\d+$/
87
- v = v.to_i
88
- end
89
- end
90
-
91
- new_hash[k] = v
92
- end
93
- end
94
-
95
- #
96
- # Building Blocks in case you want to build your own flow:
97
- #
98
-
99
- def value_converters(hash, _options)
100
- #
101
- # TO BE IMPLEMENTED
102
- #
103
- end
104
-
105
- def strip_spaces(hash, _options)
106
- hash.each_key {|key| hash[key].strip! unless hash[key].nil? } # &. syntax was introduced in Ruby 2.3 - need to stay backwards compatible
107
- end
108
-
109
- def remove_blank_values(hash, _options)
110
- hash.each_key {|key| hash.delete(key) if hash[key].nil? || hash[key].is_a?(String) && hash[key] !~ /[^[:space:]]/ }
111
- end
112
-
113
- def remove_zero_values(hash, _options)
114
- hash.each_key {|key| hash.delete(key) if hash[key].is_a?(Numeric) && hash[key].zero? }
115
- end
116
-
117
- def remove_empty_keys(hash, _options)
118
- hash.reject!{|key, _v| key.nil? || key.empty?}
119
- end
120
-
121
- def convert_values_to_numeric(hash, _options)
122
- hash.each_key do |k|
123
- case hash[k]
124
- when /^[+-]?\d+\.\d+$/
125
- hash[k] = hash[k].to_f
126
- when /^[+-]?\d+$/
127
- hash[k] = hash[k].to_i
128
- end
129
- end
130
- end
131
-
132
- def convert_values_to_numeric_unless_leading_zeroes(hash, _options)
133
- hash.each_key do |k|
134
- case hash[k]
135
- when /^[+-]?[1-9]\d*\.\d+$/
136
- hash[k] = hash[k].to_f
137
- when /^[+-]?[1-9]\d*$/
138
- hash[k] = hash[k].to_i
139
- end
140
- end
141
- end
142
-
143
- # IMPORTANT NOTE:
144
- # this can lead to cases where a nil or empty value gets converted into 0 or 0.0,
145
- # and can then not be properly removed!
146
- #
147
- # you should first try to use convert_values_to_numeric or convert_values_to_numeric_unless_leading_zeroes
148
- #
149
- def convert_to_integer(hash, _options)
150
- hash.each_key {|key| hash[key] = hash[key].to_i }
151
- end
152
-
153
- def convert_to_float(hash, _options)
154
- hash.each_key {|key| hash[key] = hash[key].to_f }
155
- end
36
+ # def hash_transformations(hash, options)
37
+ # # there may be unmapped keys, or keys purposely mapped to nil or an empty key..
38
+ # # make sure we delete any key/value pairs from the hash, which the user wanted to delete:
39
+ # hash.delete(nil)
40
+ # hash.delete('')
41
+ # hash.delete(:"")
42
+
43
+ # if options[:remove_empty_values] == true
44
+ # hash.delete_if{|_k, v| has_rails ? v.blank? : blank?(v)}
45
+ # end
46
+
47
+ # hash.delete_if{|_k, v| !v.nil? && v =~ /^(0+|0+\.0+)$/} if options[:remove_zero_values] # values are Strings
48
+ # hash.delete_if{|_k, v| v =~ options[:remove_values_matching]} if options[:remove_values_matching]
49
+
50
+ # if options[:convert_values_to_numeric]
51
+ # hash.each do |k, v|
52
+ # # deal with the :only / :except options to :convert_values_to_numeric
53
+ # next if limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)
54
+
55
+ # # convert if it's a numeric value:
56
+ # case v
57
+ # when /^[+-]?\d+\.\d+$/
58
+ # hash[k] = v.to_f
59
+ # when /^[+-]?\d+$/
60
+ # hash[k] = v.to_i
61
+ # end
62
+ # end
63
+ # end
64
+
65
+ # if options[:value_converters]
66
+ # hash.each do |k, v|
67
+ # converter = options[:value_converters][k]
68
+ # next unless converter
69
+
70
+ # hash[k] = converter.convert(v)
71
+ # end
72
+ # end
73
+
74
+ # hash
75
+ # end
156
76
 
157
77
  protected
158
78
 
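The numeric conversion that `hash_transformations` applies when `:convert_values_to_numeric` is set can be sketched in isolation. This is a minimal illustration, not the gem's code: `to_numeric` is a hypothetical helper, and only the two regular expressions are taken from the diff above.

```ruby
# Hypothetical helper (not part of SmarterCSV): converts one CSV string value
# using the same two patterns that appear in hash_transformations.
def to_numeric(value)
  case value
  when /^[+-]?\d+\.\d+$/ then value.to_f # decimal notation becomes a Float
  when /^[+-]?\d+$/      then value.to_i # digits only become an Integer
  else value                             # anything else is left untouched
  end
end

row = { id: '007', price: '3.50', delta: '-12', name: 'widget' }
converted = row.transform_values { |v| to_numeric(v) }
# converted => { id: 7, price: 3.5, delta: -12, name: 'widget' }
```

Note that `'007'` becomes `7`: the integer pattern accepts leading zeroes, so identifiers such as ZIP codes lose them unless the key is excluded from conversion.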
@@ -2,18 +2,8 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class << self
5
- # this is processing the headers from the input file
5
+ # transform the headers that were in the file:
6
6
  def header_transformations(header_array, options)
7
- if options[:v2_mode]
8
- header_transformations_v2(header_array, options)
9
- else
10
- header_transformations_v1(header_array, options)
11
- end
12
- end
13
-
14
- # ---- V1.x Version: transform the headers that were in the file: ------------------------------------------
15
- #
16
- def header_transformations_v1(header_array, options)
17
7
  header_array.map!{|x| x.gsub(%r/#{options[:quote_char]}/, '')}
18
8
  header_array.map!{|x| x.strip} if options[:strip_whitespace]
19
9
 
@@ -67,99 +57,7 @@ module SmarterCSV
67
57
  header
68
58
  end
69
59
  end
70
-
71
60
  headers
72
61
  end
73
-
74
- # ---- V2.x Version: transform the headers that were in the file: ------------------------------------------
75
- #
76
- def header_transformations_v2(header_array, options)
77
- return header_array if options[:header_transformations].nil? || options[:header_transformations].empty?
78
-
79
- # do the header transformations the user requested:
80
- if options[:header_transformations]
81
- options[:header_transformations].each do |transformation|
82
- if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
83
- header_array = transformation.call(header_array, options)
84
- else
85
- case transformation
86
- when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
87
- header_array = public_send(transformation, header_array, options)
88
- when Hash # this is called for hash arguments, e.g. header_transformations
89
- trans, args = transformation.first # .first treats the hash first element as an array
90
- header_array = apply_transformation(trans, header_array, args, options)
91
- when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
92
- trans, *args = transformation
93
- header_array = apply_transformation(trans, header_array, args, options)
94
- else
95
- raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
96
- end
97
- end
98
- end
99
- end
100
-
101
- header_array
102
- end
103
-
104
- def apply_transformation(transformation, header_array, args, options)
105
- if transformation.respond_to?(:call)
106
- # If transformation is a callable object (like a Proc)
107
- transformation.call(header_array, args, options)
108
- else
109
- # If transformation is a symbol (method name)
110
- public_send(transformation, header_array, args, options)
111
- end
112
- end
113
-
114
- # pre-defined v2 header transformations:
115
-
116
- # these are some pre-defined header transformations which can be used
117
- # all these take the headers array as the input
118
- #
119
- # the computed options can be accessed via @options
120
-
121
- def keys_as_symbols(headers, options)
122
- headers.map do |header|
123
- header.strip.downcase.gsub(%r{#{options[:quote_char]}}, '').gsub(/(\s|-)+/, '_').to_sym
124
- end
125
- end
126
-
127
- def keys_as_strings(headers, options)
128
- headers.map do |header|
129
- header.strip.gsub(%r{#{options[:quote_char]}}, '').downcase.gsub(/(\s|-)+/, '_')
130
- end
131
- end
132
-
133
- def downcase_headers(headers, _options)
134
- headers.map do |header|
135
- header.strip.downcase!
136
- end
137
- end
138
-
139
- def key_mapping(headers, mapping = {}, options)
140
- raise(SmarterCSV::IncorrectOption, "ERROR: incorrect format for key_mapping! Expecting hash with from -> to mappings") if mapping.empty? || !mapping.is_a?(Hash)
141
-
142
- headers_set = headers.to_set
143
- mapping_keys_set = mapping.keys.to_set
144
- silence_keys_set = (options[:silence_missing_keys] || []).to_set
145
-
146
- # Check for missing keys
147
- missing_keys = mapping_keys_set - headers_set - silence_keys_set
148
- raise SmarterCSV::KeyMappingError, "ERROR: cannot map headers: #{missing_keys.to_a.join(', ')}" if missing_keys.any? && !options[:silence_missing_keys]
149
-
150
- # Apply key mapping, retaining nils for explicitly mapped headers
151
- headers.map do |header|
152
- if mapping.key?(header)
153
- # Maps the key according to the mapping, including nil mapping
154
- mapping[header]
155
- elsif options[:remove_unmapped_keys]
156
- # Remove headers not specified in the mapping
157
- nil
158
- else
159
- # Keep the original header if not specified in the mapping
160
- header
161
- end
162
- end
163
- end
164
62
  end
165
63
  end
@@ -3,21 +3,11 @@
3
3
  module SmarterCSV
4
4
  class << self
5
5
  def header_validations(headers, options)
6
- if options[:v2_mode]
7
- header_validations_v2(headers, options)
8
- else
9
- header_validations_v1(headers, options)
10
- end
11
- end
12
-
13
- # ---- V1.x Version: validate the headers -----------------------------------------------------------------
14
-
15
- def header_validations_v1(headers, options)
16
- check_duplicate_headers_v1(headers, options)
17
- check_required_headers_v1(headers, options)
6
+ check_duplicate_headers(headers, options)
7
+ check_required_headers(headers, options)
18
8
  end
19
9
 
20
- def check_duplicate_headers_v1(headers, _options)
10
+ def check_duplicate_headers(headers, _options)
21
11
  header_counts = Hash.new(0)
22
12
  headers.each { |header| header_counts[header] += 1 unless header.nil? }
23
13
 
@@ -28,109 +18,17 @@ module SmarterCSV
28
18
  end
29
19
  end
30
20
 
31
- def check_required_headers_v1(headers, options)
21
+ require 'set'
22
+
23
+ def check_required_headers(headers, options)
32
24
  if options[:required_keys] && options[:required_keys].is_a?(Array)
33
25
  headers_set = headers.to_set
34
26
  missing_keys = options[:required_keys].select { |k| !headers_set.include?(k) }
35
27
 
36
28
  unless missing_keys.empty?
37
- raise SmarterCSV::MissingKeys, "ERROR: missing attributes: #{missing_keys.join(',')}"
29
+ raise SmarterCSV::MissingKeys, "ERROR: missing attributes: #{missing_keys.join(',')}. Check `SmarterCSV.headers` for original headers."
38
30
  end
39
31
  end
40
32
  end
41
-
42
- # ---- V2.x Version: validate the headers -----------------------------------------------------------------
43
-
44
- # def header_validations_v2(headers, options)
45
- # return unless options[:header_validations]
46
-
47
- # options[:header_validations].each do |validation|
48
- # if validation.respond_to?(:call)
49
- # # Directly call if it's a Proc or lambda
50
- # validation.call(headers)
51
- # else
52
- # binding.pry
53
- # # Handle Symbol, Hash, or Array
54
- # method_name, args = validation.is_a?(Symbol) ? [validation, []] : validation
55
- # public_send(method_name, headers, *Array(args))
56
- # end
57
- # end
58
- # end
59
-
60
- def header_validations_v2(headers, options)
61
- return unless options[:header_validations]
62
-
63
- # do the header validations the user requested:
64
- # Header validations typically raise errors directly
65
- #
66
- options[:header_validations].each do |validation|
67
- if validation.respond_to?(:call)
68
- # Directly call if it's a Proc or lambda
69
- validation.call(headers)
70
- else
71
- case validation
72
- when Symbol
73
- public_send(validation, headers)
74
- when Hash
75
- val, args = validation.first
76
- public_send(val, headers, args)
77
- when Array
78
- val, *args = validation
79
- public_send(val, headers, args)
80
- else
81
- raise SmarterCSV::IncorrectOption, "Invalid validation type: #{validation.class}"
82
- end
83
- end
84
- end
85
- end
86
-
87
- # def header_validations_v2_orig(headers, options)
88
- # # do the header validations the user requested:
89
- # # Header validations typically raise errors directly
90
- # #
91
- # if options[:header_validations]
92
- # options[:header_validations].each do |validation|
93
- # case validation
94
- # when Symbol
95
- # public_send(validation, headers)
96
- # when Hash
97
- # val, args = validation.first
98
- # public_send(val, headers, args)
99
- # when Array
100
- # val, args = validation
101
- # public_send(val, headers, args)
102
- # else
103
- # validation.call(headers) unless validation.nil?
104
- # end
105
- # end
106
- # end
107
- # end
108
-
109
- # these are some pre-defined header validations which can be used
110
- # all these take the headers array as the input
111
- #
112
- # the computed options can be accessed via @options
113
-
114
- def unique_headers(headers)
115
- header_counts = Hash.new(0)
116
- headers.each { |header| header_counts[header] += 1 unless header.nil? }
117
-
118
- duplicates = header_counts.select { |_, count| count > 1 }
119
-
120
- unless duplicates.empty?
121
- raise(SmarterCSV::DuplicateHeaders, "Duplicate Headers in CSV: #{duplicates.inspect}")
122
- end
123
- end
124
-
125
- def required_headers(headers, required = [])
126
- raise(SmarterCSV::IncorrectOption, "ERROR: required_headers validation needs an array argument") unless required.is_a?(Array)
127
-
128
- headers_set = headers.to_set
129
- missing = required.select { |r| !headers_set.include?(r) }
130
-
131
- unless missing.empty?
132
- raise(SmarterCSV::MissingKeys, "Missing Headers in CSV: #{missing.inspect}")
133
- end
134
- end
135
33
  end
136
34
  end
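The required-keys check kept in this file boils down to a set lookup followed by an error message. The sketch below reproduces that logic outside the gem; `missing_required_keys` is a hypothetical name, and the message string is copied from the new `raise` line in the diff.

```ruby
require 'set'

# Hypothetical stand-in for the lookup inside check_required_headers:
# returns the required keys that do not appear among the headers.
def missing_required_keys(headers, required_keys)
  return [] unless required_keys.is_a?(Array)

  headers_set = headers.to_set
  required_keys.reject { |k| headers_set.include?(k) }
end

headers = [:first_name, :last_name, :email]
missing = missing_required_keys(headers, [:email, :phone, :zip])

# Same wording as the 1.11.0 error message:
message = "ERROR: missing attributes: #{missing.join(',')}. " \
          "Check `SmarterCSV.headers` for original headers."
```

In the gem, a non-empty result raises `SmarterCSV::MissingKeys` with that message; here it is only built as a string so the sketch runs without the gem installed.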
@@ -1,7 +1,7 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module SmarterCSV
4
- COMMON_OPTIONS = {
4
+ DEFAULT_OPTIONS = {
5
5
  acceleration: true,
6
6
  auto_row_sep_chars: 500,
7
7
  chunk_size: nil,
@@ -15,66 +15,39 @@ module SmarterCSV
15
15
  force_utf8: false,
16
16
  headers_in_file: true,
17
17
  invalid_byte_sequence: '',
18
- quote_char: '"',
19
- remove_unmapped_keys: false,
20
- row_sep: :auto, # was: $/,
21
- silence_deprecations: false, # new in 1.11
22
- silence_missing_keys: false,
23
- skip_lines: nil,
24
- user_provided_headers: nil,
25
- verbose: false,
26
- with_line_numbers: false,
27
- v2_mode: false,
28
- }.freeze
29
-
30
- V1_DEFAULT_OPTIONS = {
31
18
  keep_original_headers: false,
32
19
  key_mapping: nil,
20
+ quote_char: '"',
33
21
  remove_empty_hashes: true,
34
22
  remove_empty_values: true,
23
+ remove_unmapped_keys: false,
35
24
  remove_values_matching: nil,
36
25
  remove_zero_values: false,
37
26
  required_headers: nil,
38
27
  required_keys: nil,
28
+ row_sep: :auto, # was: $/,
29
+ silence_missing_keys: false,
30
+ skip_lines: nil,
39
31
  strings_as_keys: false,
40
32
  strip_chars_from_headers: nil,
41
33
  strip_whitespace: true,
34
+ user_provided_headers: nil,
42
35
  value_converters: nil,
43
- v2_mode: false,
36
+ verbose: false,
37
+ with_line_numbers: false,
44
38
  }.freeze
45
39
 
46
- DEPRECATED_OPTIONS = [
47
- :convert_values_to_numeric,
48
- :downcase_headers,
49
- :keep_original_headers,
50
- :key_mapping,
51
- :remove_empty_hashes,
52
- :remove_empty_values,
53
- :remove_values_matching,
54
- :remove_zero_values,
55
- :required_headers,
56
- :required_keys,
57
- :stirngs_as_keys,
58
- :strip_cars_from_headers,
59
- :strip_whitespace,
60
- :value_converters,
61
- ].freeze
62
-
63
40
  class << self
64
41
  # NOTE: this is not called when "parse" methods are tested by themselves
65
42
  def process_options(given_options = {})
66
43
  puts "User provided options:\n#{pp(given_options)}\n" if given_options[:verbose]
67
44
 
68
- # fix invalid input
69
- given_options[:invalid_byte_sequence] = '' if given_options[:invalid_byte_sequence].nil?
70
-
71
- # warn about deprecated options / raises error for v2_mode
72
- handle_deprecations(given_options)
45
+ @options = DEFAULT_OPTIONS.dup.merge!(given_options)
73
46
 
74
- given_options = preprocess_v2_options(given_options) if given_options[:v2_mode]
47
+ # fix invalid input
48
+ @options[:invalid_byte_sequence] ||= ''
75
49
 
76
- @options = compute_default_options(given_options).merge!(given_options)
77
- puts "Computed options:\n#{pp(@options)}\n" if given_options[:verbose]
50
+ puts "Computed options:\n#{pp(@options)}\n" if @options[:verbose]
78
51
 
79
52
  validate_options!(@options)
80
53
  @options
@@ -84,35 +57,11 @@ module SmarterCSV
84
57
  #
85
58
  # ONLY FOR BACKWARDS-COMPATIBILITY
86
59
  def default_options
87
- COMMON_OPTIONS.merge(V1_DEFAULT_OPTIONS)
60
+ DEFAULT_OPTIONS
88
61
  end
89
62
 
90
63
  private
91
64
 
92
- def compute_default_options(options = {})
93
- return COMMON_OPTIONS.merge(V1_DEFAULT_OPTIONS) unless options[:v2_mode]
94
-
95
- default_options = {}
96
- if options[:defaults].to_s != 'none'
97
- default_options = COMMON_OPTIONS.dup.merge(V2_DEFAULT_OPTIONS)
98
- if options[:defaults].to_s == 'v1'
99
- default_options.merge(V1_TRANSFORMATIONS)
100
- else
101
- default_options.merge(V2_TRANSFORMATIONS)
102
- end
103
- end
104
- end
105
-
106
- def handle_deprecations(options)
107
- used_deprecated_options = DEPRECATED_OPTIONS & options.keys
108
- message = "SmarterCSV #{VERSION} DEPRECATED OPTIONS: #{pp(used_deprecated_options)}"
109
- if options[:v2_mode]
110
- raise(SmarterCSV::DeprecatedOptions, "ERROR: #{message}") unless used_deprecated_options.empty? || options[:silence_deprecations]
111
- else
112
- puts "DEPRECATION WARNING: #{message}" unless used_deprecated_options.empty? || options[:silence_deprecations]
113
- end
114
- end
115
-
116
65
  def validate_options!(options)
117
66
  # deprecate required_headers
118
67
  unless options[:required_headers].nil?
@@ -141,57 +90,5 @@ module SmarterCSV
141
90
  def pp(value)
142
91
  defined?(AwesomePrint) ? value.awesome_inspect(index: nil) : value.inspect
143
92
  end
144
-
145
- # ---- V2 code ----------------------------------------------------------------------------------------
146
-
147
- V2_DEFAULT_OPTIONS = {
148
- # These need to go to the COMMON_OPTIONS:
149
- remove_empty_hashes: true, # this might need a transformation or move to common options
150
- # ------------
151
- header_transformations: [:keys_as_symbols],
152
- header_validations: [:unique_headers],
153
- # data_transformations: [:replace_blank_with_nil],
154
- # data_validations: [],
155
- hash_transformations: [:strip_spaces, :remove_blank_values],
156
- hash_validations: [],
157
- v2_mode: true,
158
- }.freeze
159
-
160
- V2_TRANSFORMATIONS = {
161
- header_transformations: [:keys_as_symbols],
162
- header_validations: [:unique_headers],
163
- # data_transformations: [:replace_blank_with_nil],
164
- # data_validations: [],
165
- hash_transformations: [:v1_backwards_compatibility],
166
- # hash_transformations: [:remove_empty_keys, :strip_spaces, :remove_blank_values, :convert_values_to_numeric], # ??? :convert_values_to_numeric]
167
- hash_validations: [],
168
- }.freeze
169
-
170
- V1_TRANSFORMATIONS = {
171
- header_transformations: [:keys_as_symbols],
172
- header_validations: [:unique_headers],
173
- # data_transformations: [:replace_blank_with_nil],
174
- # data_validations: [],
175
- hash_transformations: [:strip_spaces, :remove_blank_values, :convert_values_to_numeric],
176
- hash_validations: [],
177
- }.freeze
178
-
179
- def preprocess_v2_options(options)
180
- return options unless options[:v2_mode] || options[:header_transformations]
181
-
182
- # We want to provide safe defaults for easy processing, that is why we have a special keyword :none
183
- # to not do any header transformations..
184
- #
185
- # this is why we need to remove the 'none' here:
186
- #
187
- requested_header_transformations = options[:header_transformations]
188
- if requested_header_transformations.to_s == 'none'
189
- requested_header_transformations = []
190
- else
191
- requested_header_transformations = requested_header_transformations.reject {|x| x.to_s == 'none'} unless requested_header_transformations.nil?
192
- end
193
- options[:header_transformations] = requested_header_transformations || []
194
- options
195
- end
196
93
  end
197
94
  end
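The rewritten `process_options` merges the caller's options into a copy of the defaults and only then patches `:invalid_byte_sequence`, so the caller's hash is never mutated. The following sketch shows why that matters for frozen input (see the 1.10.3 changelog entry). The trimmed-down defaults and the names `sketch_defaults` and `merge_options` are assumptions for illustration only.

```ruby
# Hypothetical, trimmed-down defaults; the real DEFAULT_OPTIONS has many more keys.
def sketch_defaults
  { col_sep: ',', invalid_byte_sequence: '', verbose: false }.freeze
end

def merge_options(given_options = {})
  # dup of a frozen Hash is not frozen, so merge! works on the copy
  options = sketch_defaults.dup.merge!(given_options)
  # fix invalid input on the merged copy, not on the caller's hash
  options[:invalid_byte_sequence] ||= ''
  options
end

given  = { col_sep: ';', invalid_byte_sequence: nil }.freeze
merged = merge_options(given)
# merged => { col_sep: ';', invalid_byte_sequence: '', verbose: false }
# given is unchanged and still frozen
```

The removed code assigned `given_options[:invalid_byte_sequence] = ''` directly, which would raise `FrozenError` for the `given` hash above.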
@@ -2,7 +2,6 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class SmarterCSVException < StandardError; end
5
- class DeprecatedOptions < SmarterCSVException; end
6
5
  class HeaderSizeMismatch < SmarterCSVException; end
7
6
  class IncorrectOption < SmarterCSVException; end
8
7
  class ValidationError < SmarterCSVException; end
@@ -109,10 +108,6 @@ module SmarterCSV
109
108
 
110
109
  next if options[:remove_empty_hashes] && hash.empty?
111
110
 
112
- #
113
- # should HASH VALIDATIONS go here instead?
114
- #
115
-
116
111
  puts "CSV Line #{@file_line_count}: #{pp(hash)}" if @verbose == '2' # very verbose setting
117
112
  # optional adding of csv_line_number to the hash to help debugging
118
113
  hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
@@ -170,19 +165,22 @@ module SmarterCSV
170
165
  end
171
166
 
172
167
  class << self
173
- # Counts the number of quote characters in a line, excluding escaped quotes.
174
- # FYI: using Ruby built-in regex processing to determine the number of quotes
175
168
  def count_quote_chars(line, quote_char)
176
169
  return 0 if line.nil? || quote_char.nil? || quote_char.empty?
177
170
 
178
- # Escaped quote character (e.g., if quote_char is ", then escaped is \")
179
- escaped_quote = Regexp.escape(quote_char)
171
+ count = 0
172
+ escaped = false
180
173
 
181
- # Pattern to match a quote character not preceded by a backslash
182
- pattern = /(?<!\\)(?:\\\\)*#{escaped_quote}/
174
+ line.each_char do |char|
175
+ if char == '\\' && !escaped
176
+ escaped = true
177
+ else
178
+ count += 1 if char == quote_char && !escaped
179
+ escaped = false
180
+ end
181
+ end
183
182
 
184
- # Count occurrences
185
- line.scan(pattern).count
183
+ count
186
184
  end
187
185
 
188
186
  def has_acceleration?
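The new `count_quote_chars` replaces the regex scan with a single pass that tracks whether the previous character was an unescaped backslash. The method body below is copied from the added lines so it can be run on its own; the sample inputs are made up for illustration.

```ruby
def count_quote_chars(line, quote_char)
  return 0 if line.nil? || quote_char.nil? || quote_char.empty?

  count = 0
  escaped = false

  line.each_char do |char|
    if char == '\\' && !escaped
      escaped = true # the next character is escaped
    else
      count += 1 if char == quote_char && !escaped
      escaped = false
    end
  end

  count
end

plain   = count_quote_chars('a,"b",c', '"')  # two unescaped quotes       => 2
escaped = count_quote_chars('"a \" b"', '"') # the \" is skipped          => 2
doubled = count_quote_chars('\\\\"', '"')    # \\ is a literal backslash  => 1
```

An odd count would presumably signal a quoted field that is still open at the end of the line, but that use is an assumption; the caller is not shown in this diff.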
@@ -15,7 +15,6 @@ module SmarterCSV
15
15
  @raw_header = nil # header as it appears in the file
16
16
  @result = []
17
17
  @warnings = {}
18
- @v2_mode = false
19
18
  @enforce_utf8 = false # only set to true if needed (after options parsing)
20
19
  end
21
20
 
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module SmarterCSV
4
- VERSION = "1.11.0.pre2"
4
+ VERSION = "1.11.0"
5
5
  end
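The version bump from `1.11.0.pre2` to `1.11.0` moves the gem from a prerelease to a regular release. A small sketch using `Gem::Version` from RubyGems shows how the two strings compare; the variable names are illustrative only.

```ruby
require 'rubygems'

pre_release = Gem::Version.new('1.11.0.pre2')
release     = Gem::Version.new('1.11.0')

is_pre      = pre_release.prerelease? # true: the "pre2" segment marks a prerelease
is_final    = !release.prerelease?    # true: no prerelease segment
sorts_first = pre_release < release   # true: prereleases sort before the final release
```

Because prereleases sort before the final version, a requirement such as `~> 1.11` resolves to `1.11.0` rather than `1.11.0.pre2` unless prereleases are requested explicitly.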
@@ -0,0 +1,102 @@
1
+ # frozen_string_literal: true
2
+
3
+ module SmarterCSV
4
+ #
5
+ # Generate CSV files
6
+ #
7
+ # Create an instance of the Writer class with the filename and options.
8
+ # call `<<` one or multiple times to append data to the file.
9
+ # call `finalize` to save the file.
10
+ #
11
+ # The `<<` method can take different arguments:
12
+ # * a single Hash
13
+ # * an array of Hashes
14
+ # * nested arrays of arrays of Hashes
15
+ #
16
+ # By default SmarterCSV::Writer automatically discovers all headers that are present
17
+ # in the data on-the-fly. This can be disabled, then only given headers are used.
18
+ # Disabling can be useful when you want to select attributes from hashes, or ActiveRecord instances.
19
+ #
20
+ # If `discover_headers` is enabled, and headers are given, any new headers that are found in the data will still be appended.
21
+ #
22
+ # The Writer automatically quotes fields containing the col_sep, row_sep, or the quote_char.
23
+ #
24
+ # Options:
25
+ # col_sep : defaults to , but can be set to any other character
26
+ # row_sep : defaults to LF \n , but can be set to \r\n or \r or anything else
27
+ # quote_char : defaults to "
28
+ # discover_headers : defaults to true
29
+ # headers : defaults to []
30
+ # force_quotes: defaults to false
31
+ # map_headers: defaults to {}, can be a hash of key -> value mappings
32
+
33
+ # IMPORTANT NOTES:
34
+ # * Data hashes could contain strings or symbols as keys.
35
+ # Make sure to use the correct form when specifying headers manually,
36
+ # in combination with the :discover_headers option
37
+
38
+ class Writer
39
+ def initialize(file_path, options = {})
40
+ @options = options
41
+ @discover_headers = options.has_key?(:discover_headers) ? (options[:discover_headers] == true) : true
42
+ @headers = options[:headers] || []
43
+ @row_sep = options[:row_sep] || "\n" # RFC4180 "\r\n"
44
+ @col_sep = options[:col_sep] || ','
45
+ @quote_char = '"'
46
+ @force_quotes = options[:force_quotes] == true
47
+ @map_headers = options[:map_headers] || {}
48
+ @output_file = File.open(file_path, 'w+')
49
+ # hidden state:
50
+ @temp_file = Tempfile.new('tempfile', '/tmp')
51
+ @quote_regex = Regexp.union(@col_sep, @row_sep, @quote_char)
52
+ end
53
+
54
+ def <<(data)
55
+ case data
56
+ when Hash
57
+ process_hash(data)
58
+ when Array
59
+ data.each { |item| self << item }
60
+ when NilClass
61
+ # ignore
62
+ else
63
+ raise ArgumentError, "Invalid data type: #{data.class}. Must be a Hash or an Array."
64
+ end
65
+ end
66
+
67
+ def finalize
68
+ # Map headers if :map_headers option is provided
69
+ mapped_headers = @headers.map { |header| @map_headers[header] || header }
70
+
71
+ @temp_file.rewind
72
+ @output_file.write(mapped_headers.join(@col_sep) + @row_sep)
73
+ @output_file.write(@temp_file.read)
74
+ @output_file.flush
75
+ @output_file.close
76
+ @temp_file.delete
77
+ end
78
+
79
+ private
80
+
81
+ def process_hash(hash)
82
+ if @discover_headers
83
+ hash_keys = hash.keys
84
+ new_keys = hash_keys - @headers
85
+ @headers.concat(new_keys)
86
+ end
87
+
88
+ # Reorder the hash to match the current headers order and fill missing fields
89
+ ordered_row = @headers.map { |header| hash[header] || '' }
90
+
91
+ @temp_file.write ordered_row.map { |value| escape_csv_field(value) }.join(@col_sep) + @row_sep
92
+ end
93
+
94
+ def escape_csv_field(field)
95
+ if @force_quotes || field.to_s.match(@quote_regex)
96
+ "\"#{field}\""
97
+ else
98
+ field.to_s
99
+ end
100
+ end
101
+ end
102
+ end
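As written in the hunk above, `escape_csv_field` wraps a field in double quotes when it contains `col_sep`, `row_sep`, or the quote character, but it does not alter quote characters inside the field. RFC 4180 expects embedded quotes to be doubled. The sketch below shows one way that could look; it is not the gem's implementation, and the name `quote_field` and its keyword arguments are hypothetical.

```ruby
# Hypothetical helper, not part of smarter_csv: quotes a field when needed
# and doubles embedded quote characters, as RFC 4180 describes.
def quote_field(field, col_sep: ',', row_sep: "\n", quote_char: '"', force_quotes: false)
  text = field.to_s
  # Same trigger as in the diff: the field contains a separator or a quote.
  needs_quotes = force_quotes || text.match?(Regexp.union(col_sep, row_sep, quote_char))
  return text unless needs_quotes

  # Double every embedded quote character, then wrap the whole field.
  doubled = text.gsub(quote_char) { quote_char * 2 }
  "#{quote_char}#{doubled}#{quote_char}"
end

quote_field('plain')                  # returned unchanged
quote_field('a,b')                    # wrapped because it contains col_sep
quote_field('say "hi"')               # embedded quotes are doubled, then wrapped
quote_field(42, force_quotes: true)   # always wrapped when force_quotes is set
```

The block form of `gsub` is used so the replacement is taken literally, whatever the quote character is.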
data/lib/smarter_csv.rb CHANGED
@@ -1,7 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- require 'set'
4
-
5
3
  require "smarter_csv/version"
6
4
  require "smarter_csv/file_io"
7
5
  require "smarter_csv/options_processing"
@@ -12,6 +10,7 @@ require 'smarter_csv/header_validations'
12
10
  require "smarter_csv/headers"
13
11
  require "smarter_csv/hash_transformations"
14
12
  require "smarter_csv/parse"
13
+ require "smarter_csv/writer"
15
14
 
16
15
  # load the C-extension:
17
16
  case RUBY_ENGINE
data/smarter_csv.gemspec CHANGED
@@ -9,8 +9,8 @@ Gem::Specification.new do |spec|
9
9
  spec.authors = ["Tilo Sloboda"]
10
10
  spec.email = ["tilo.sloboda@gmail.com"]
11
11
 
12
- spec.summary = "Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots of optional features, e.g. chunked processing for huge CSV files"
13
- spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with optional features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
12
+ spec.summary = "CSV Reading and Writing"
13
+ spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with lots of features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
14
14
  spec.homepage = "https://github.com/tilo/smarter_csv"
15
15
  spec.license = 'MIT'
16
16
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: smarter_csv
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.11.0.pre2
4
+ version: 1.11.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Tilo Sloboda
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-01-14 00:00:00.000000000 Z
11
+ date: 2024-07-02 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: awesome_print
@@ -95,7 +95,7 @@ dependencies:
95
95
  - !ruby/object:Gem::Version
96
96
  version: '0'
97
97
  description: Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with
98
- optional features for processing large files in parallel, embedded comments, unusual
98
+ lots of features for processing large files in parallel, embedded comments, unusual
99
99
  field- and record-separators, flexible mapping of CSV-headers to Hash-keys
100
100
  email:
101
101
  - tilo.sloboda@gmail.com
@@ -104,6 +104,7 @@ extensions:
104
104
  - ext/smarter_csv/extconf.rb
105
105
  extra_rdoc_files: []
106
106
  files:
107
+ - ".rspec"
107
108
  - ".rubocop.yml"
108
109
  - ".rvmrc"
109
110
  - CHANGELOG.md
@@ -127,6 +128,7 @@ files:
127
128
  - lib/smarter_csv/smarter_csv.rb
128
129
  - lib/smarter_csv/variables.rb
129
130
  - lib/smarter_csv/version.rb
131
+ - lib/smarter_csv/writer.rb
130
132
  - smarter_csv.gemspec
131
133
  homepage: https://github.com/tilo/smarter_csv
132
134
  licenses:
@@ -147,13 +149,12 @@ required_ruby_version: !ruby/object:Gem::Requirement
147
149
  version: 2.5.0
148
150
  required_rubygems_version: !ruby/object:Gem::Requirement
149
151
  requirements:
150
- - - ">"
152
+ - - ">="
151
153
  - !ruby/object:Gem::Version
152
- version: 1.3.1
154
+ version: '0'
153
155
  requirements: []
154
156
  rubygems_version: 3.2.3
155
157
  signing_key:
156
158
  specification_version: 4
157
- summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots
158
- of optional features, e.g. chunked processing for huge CSV files
159
+ summary: CSV Reading and Writing
159
160
  test_files: []