smarter_csv 1.11.0.pre2 → 1.11.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 9027a37c4b29e68fcbc559a6a8285e5076684883612e98b13f116526dadc6e4b
-   data.tar.gz: 43cfa0254ac2caa8ca02a8863fc790f1c56beb0b16e175a16fd947f92eda8c08
+   metadata.gz: 996d8195dc01722461990cb51dbc2b65d4cf2edf93eff2c1d4c5359614911b92
+   data.tar.gz: 0f193b450b4cfb97a1f11bbeacbc659b5220fa545751533a225347349943bc9a
  SHA512:
-   metadata.gz: d5b2eac35e33bdeb9ec632578207c47f34a08a4595c1ebf04929a7fe302efc3fc08565c0d6fc6454fd538b99cc1c4a5662599a88693e3aa1b80c5c0c7fc1b05e
-   data.tar.gz: 36f622f12d5412ef8919c30a267def315985aef4b3f33c209341a046442795bb66df6c8ebb97399f35e24fd565df658a0e4b7a94d3563831d7ce32facbeab33f
+   metadata.gz: 1a8141ccf75d5f6edf5ffc6f4b719f36dabfc826903b69804cfda09988f43fce24678e7e44fe53e0228a3077858fa870b920bf019b9dd996c53d9024364767ca
+   data.tar.gz: 25daf165cb30decb0f7cd5703758dde5879a12398e6826fb8e0def3efb135b33ab8ef2bf26149d0bd887d42d73a19f9af8ab5e7ea1aee737e139a3acf5da6a4f
data/.rspec ADDED
@@ -0,0 +1 @@
+ --require spec_helper
data/.rubocop.yml CHANGED
@@ -88,18 +88,12 @@ Style/IfInsideElse:
88
88
  Style/IfUnlessModifier:
89
89
  Enabled: false
90
90
 
91
- Style/InverseMethods:
92
- Enabled: false
93
-
94
91
  Style/NestedTernaryOperator:
95
92
  Enabled: false
96
93
 
97
94
  Style/PreferredHashMethods:
98
95
  Enabled: false
99
96
 
100
- Style/Proc:
101
- Enabled: false
102
-
103
97
  Style/NumericPredicate:
104
98
  Enabled: false
105
99
 
@@ -118,6 +112,9 @@ Style/SlicingWithRange:
118
112
  Style/SpecialGlobalVars: # DANGER: unsafe rule!!
119
113
  Enabled: false
120
114
 
115
+ Style/StringConcatenation:
116
+ Enabled: false
117
+
121
118
  Style/StringLiterals:
122
119
  Enabled: false
123
120
  EnforcedStyle: double_quotes
@@ -135,9 +132,6 @@ Style/SymbolProc: # old Ruby versions can't do this
135
132
  Style/TrailingCommaInHashLiteral:
136
133
  Enabled: false
137
134
 
138
- Style/TrailingCommaInArrayLiteral:
139
- Enabled: false
140
-
141
135
  Style/TrailingUnderscoreVariable:
142
136
  Enabled: false
143
137
 
@@ -147,9 +141,6 @@ Style/TrivialAccessors:
147
141
  # Style/UnlessModifier:
148
142
  # Enabled: false
149
143
 
150
- Style/WordArray:
151
- Enabled: false
152
-
153
144
  Style/ZeroLengthPredicate:
154
145
  Enabled: false
155
146
 
data/CHANGELOG.md CHANGED
@@ -1,42 +1,16 @@
1
1
 
2
2
  # SmarterCSV 1.x Change Log
3
3
 
4
- ## T.B.D.
5
-
6
- * code refactor
4
+ ## 1.11.0 (2024-07-02)
5
+ * added SmarterCSV::Writer to output CSV files ([issue #44](https://github.com/tilo/smarter_csv/issues/44))
7
6
 
8
- * NEW BEHAVIOR:
9
- - hidden `:v2_mode` options (incomplete!)
10
- - pre-processing for v2 options
11
- - implemented v2 `:header_transformations` (DO NOT USE YET!)
12
- + -> check if all v1 transformations are correctly done
13
- How are we going to
14
- * disambiguate headers?
15
-
16
-
17
- * do key_mapping? -> seems to work
18
- - remove_unmapped_keys ?
19
- - silence missing keys ... a missing mapped key should raise an exception, except when silenced
20
- - required_keys needs to be a header-validation
21
-
7
+ ## 1.10.3 (2024-03-10)
8
+ * fixed issue when frozen options are handed in (thanks to Daniel Pepper)
9
+ * cleaned-up rspec tests (thanks to Daniel Pepper)
10
+ * fixed link in README (issue #251)
22
11
 
23
- * keep original headers? -> :none
24
- * do strings_as_* ? -> either :keys_as_symbols, :keys_as_strings
25
- * remove quote_chars? -> included in keys_as_*
26
- * strip whitespace? -> included in keys_as_*
27
-
28
- TODO:
29
-
30
- - add tests for header_validations
31
-
32
- - modify options to handle v1 and v2 options
33
- - add v1 defaults in v2 processing
34
- - add tests for all options processing
35
- - 100% backwards compatibility when working in v1 mode
36
-
37
-
38
- ## 1.10.1 (2024-01-07)
39
- * fix incorrect warning about UTF-8 (issue #268, thanks hirowatari)
12
+ ## 1.10.2 (2024-02-11)
13
+ * improve error message for missing keys
40
14
 
41
15
  ## 1.10.1 (2024-01-07)
42
16
  * fix incorrect warning about UTF-8 (issue #268, thanks hirowatari)
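
Note on the 1.11.0 entry above: it adds SmarterCSV::Writer. As a rough illustration of the idea (headers are discovered while rows arrive, and the header line is emitted first when the output is finalized), here is a minimal self-contained sketch. `MiniCsvWriter` is a hypothetical class written for this note. It buffers rows in memory and writes to any IO-like object, which is an assumption made to keep the example short; it is not the gem's implementation or API.

```ruby
require 'stringio'

# Hypothetical stand-in for illustration only; NOT SmarterCSV::Writer.
class MiniCsvWriter
  def initialize(io, col_sep: ',', row_sep: "\n", quote_char: '"')
    @io = io
    @col_sep = col_sep
    @row_sep = row_sep
    @quote_char = quote_char
    @headers = []
    @rows = []
    # A field needs quoting if it contains any of these three strings.
    @quote_regex = Regexp.union(col_sep, row_sep, quote_char)
  end

  # Accepts a Hash or (nested) Arrays of Hashes; nil is ignored.
  def <<(data)
    case data
    when Hash
      @headers.concat(data.keys - @headers) # discover new headers on the fly
      @rows << data
    when Array
      data.each { |item| self << item }
    when nil
      # ignore
    else
      raise ArgumentError, "Invalid data type: #{data.class}"
    end
    self
  end

  # Write the header line, then every row in header order; missing fields stay empty.
  def finalize
    @io.write(@headers.join(@col_sep) + @row_sep)
    @rows.each do |row|
      @io.write(@headers.map { |h| quote(row[h]) }.join(@col_sep) + @row_sep)
    end
  end

  private

  def quote(value)
    str = value.to_s
    return str unless str.match?(@quote_regex)

    @quote_char + str.gsub(@quote_char, @quote_char * 2) + @quote_char
  end
end

out = StringIO.new
writer = MiniCsvWriter.new(out)
writer << { name: 'Ann', city: 'Oslo' }
writer << [{ name: 'Bob', note: 'likes a,b' }]
writer.finalize
csv_text = out.string
```

Buffering whole rows keeps the sketch simple, but it holds all data in memory until `finalize`; a writer meant for large outputs would likely spool rows to disk instead.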
data/CONTRIBUTORS.md CHANGED
@@ -51,4 +51,5 @@ A Big Thank you to everyone who filed issues, sent comments, and who contributed
51
51
  * [Rahul Chaudhary](https://github.com/rahulch95)
52
52
  * [Alessandro Fazzi](https://github.com/pioneerskies)
53
53
  * [JP Camara](https://github.com/jpcamara)
54
- * [Hiro Watari](https://github.com/hirowatari)
54
+ * [Kenton Hirowatari](https://github.com/hirowatari)
55
+ * [Daniel Pepper](https://github.com/dpep)
data/README.md CHANGED
@@ -3,10 +3,11 @@
3
3
 
4
4
  [![codecov](https://codecov.io/gh/tilo/smarter_csv/branch/main/graph/badge.svg?token=1L7OD80182)](https://codecov.io/gh/tilo/smarter_csv) [![Gem Version](https://badge.fury.io/rb/smarter_csv.svg)](http://badge.fury.io/rb/smarter_csv)
5
5
 
6
+ This library provides a complete interface to CSV files and data. It offers tools to enable you to read and write to and from Strings or IO objects, as needed.
6
7
 
7
- #### LATEST CHANGES
8
+ #### BREAKING CHANGES
8
9
 
9
- * Version 1.10.0 has BREAKING CHANGES:
10
+ * Version 1.10.0 had BREAKING CHANGES:
10
11
 
11
12
  Changed behavior:
12
13
  + when `user_provided_headers` are provided:
@@ -23,13 +24,8 @@
23
24
 
24
25
  * default branch is `main` for 1.x development
25
26
 
26
- * 2.x development is on `2.0-development` (check this branch for 2.0 documentation)
27
- - This is an EXPERIMENTAL branch - DO NOT USE in production
28
-
29
- #### Work towards Future Version 2.x
30
-
31
- * Work towards SmarterCSV 2.x is still ongoing, with improved features, and more streamlined options, but consider it as experimental at this time.
32
- Please check the [2.0-develop branch](https://github.com/tilo/smarter_csv/tree/2.0-develop), open any issues and pull requests with mention of tag v2.0.
27
+ * 2.x development is [MOVED TO THIS PR](https://github.com/tilo/smarter_csv/pull/267)
28
+ - 2.x behavior is still EXPERIMENTAL - DO NOT USE in production
33
29
 
34
30
  ---------------
35
31
 
@@ -394,10 +390,9 @@ And header and data validations will also be supported in 2.x
394
390
  * some CSV files use un-escaped quotation characters inside fields. This can cause the import to break. To get around this, use the `:force_simple_split => true` option in combination with `:strip_chars_from_headers => /[\-"]/` . This will also significantly speed up the import.
395
391
  If you would force a different :quote_char instead (setting it to a non-used character), then the import would be up to 5-times slower than using `:force_simple_split`.
396
392
 
397
- ## See also:
398
-
399
- http://www.unixgods.org/~tilo/Ruby/process_csv_as_hashes.html
393
+ ## The original post that started SmarterCSV:
400
394
 
395
+ http://www.unixgods.org/Ruby/process_csv_as_hashes.html
401
396
 
402
397
 
403
398
  ## Installation
@@ -2,16 +2,7 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class << self
5
- # this is processing the headers from the input file
6
5
  def hash_transformations(hash, options)
7
- if options[:v2_mode]
8
- hash_transformations_v2(hash, options)
9
- else
10
- hash_transformations_v1(hash, options)
11
- end
12
- end
13
-
14
- def hash_transformations_v1(hash, options)
15
6
  # there may be unmapped keys, or keys purposely mapped to nil or an empty key..
16
7
  # make sure we delete any key/value pairs from the hash, which the user wanted to delete:
17
8
  remove_empty_values = options[:remove_empty_values] == true
@@ -42,117 +33,46 @@ module SmarterCSV
42
33
  end
43
34
  end
44
35
 
45
- def hash_transformations_v2(hash, options)
46
- return hash if options[:hash_transformations].nil? || options[:hash_transformations].empty?
47
-
48
- # do the header transformations the user requested:
49
- if options[:hash_transformations]
50
- options[:hash_transformations].each do |transformation|
51
- if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
52
- hash = transformation.call(hash, options)
53
- else
54
- case transformation
55
- when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
56
- hash = public_send(transformation, hash, options)
57
- when Hash # this is called for hash arguments, e.g. hash_transformations
58
- trans, args = transformation.first # .first treats the hash first element as an array
59
- hash = apply_transformation(trans, hash, args, options)
60
- when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
61
- trans, *args = transformation
62
- hash = apply_transformation(trans, hash, args, options)
63
- else
64
- raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
65
- end
66
- end
67
- end
68
- end
69
-
70
- hash
71
- end
72
-
73
- #
74
- # To handle v1-backward-compatible behavior, it is faster to roll all behavior into one method
75
- #
76
- def v1_backwards_compatibility(hash, options)
77
- hash.each_with_object({}) do |(k, v), new_hash|
78
- next if k.nil? || k == '' || k == :"" # remove_empty_keys
79
- next if has_rails ? v.blank? : blank?(v) # remove_empty_values
80
-
81
- # convert_values_to_numeric:
82
- # deal with the :only / :except options to :convert_values_to_numeric
83
- unless limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)
84
- if v =~ /^[+-]?\d+\.\d+$/
85
- v = v.to_f
86
- elsif v =~ /^[+-]?\d+$/
87
- v = v.to_i
88
- end
89
- end
90
-
91
- new_hash[k] = v
92
- end
93
- end
94
-
95
- #
96
- # Building Blocks in case you want to build your own flow:
97
- #
98
-
99
- def value_converters(hash, _options)
100
- #
101
- # TO BE IMPLEMENTED
102
- #
103
- end
104
-
105
- def strip_spaces(hash, _options)
106
- hash.each_key {|key| hash[key].strip! unless hash[key].nil? } # &. syntax was introduced in Ruby 2.3 - need to stay backwards compatible
107
- end
108
-
109
- def remove_blank_values(hash, _options)
110
- hash.each_key {|key| hash.delete(key) if hash[key].nil? || hash[key].is_a?(String) && hash[key] !~ /[^[:space:]]/ }
111
- end
112
-
113
- def remove_zero_values(hash, _options)
114
- hash.each_key {|key| hash.delete(key) if hash[key].is_a?(Numeric) && hash[key].zero? }
115
- end
116
-
117
- def remove_empty_keys(hash, _options)
118
- hash.reject!{|key, _v| key.nil? || key.empty?}
119
- end
120
-
121
- def convert_values_to_numeric(hash, _options)
122
- hash.each_key do |k|
123
- case hash[k]
124
- when /^[+-]?\d+\.\d+$/
125
- hash[k] = hash[k].to_f
126
- when /^[+-]?\d+$/
127
- hash[k] = hash[k].to_i
128
- end
129
- end
130
- end
131
-
132
- def convert_values_to_numeric_unless_leading_zeroes(hash, _options)
133
- hash.each_key do |k|
134
- case hash[k]
135
- when /^[+-]?[1-9]\d*\.\d+$/
136
- hash[k] = hash[k].to_f
137
- when /^[+-]?[1-9]\d*$/
138
- hash[k] = hash[k].to_i
139
- end
140
- end
141
- end
142
-
143
- # IMPORTANT NOTE:
144
- # this can lead to cases where a nil or empty value gets converted into 0 or 0.0,
145
- # and can then not be properly removed!
146
- #
147
- # you should first try to use convert_values_to_numeric or convert_values_to_numeric_unless_leading_zeroes
148
- #
149
- def convert_to_integer(hash, _options)
150
- hash.each_key {|key| hash[key] = hash[key].to_i }
151
- end
152
-
153
- def convert_to_float(hash, _options)
154
- hash.each_key {|key| hash[key] = hash[key].to_f }
155
- end
36
+ # def hash_transformations(hash, options)
37
+ # # there may be unmapped keys, or keys purposely mapped to nil or an empty key..
38
+ # # make sure we delete any key/value pairs from the hash, which the user wanted to delete:
39
+ # hash.delete(nil)
40
+ # hash.delete('')
41
+ # hash.delete(:"")
42
+
43
+ # if options[:remove_empty_values] == true
44
+ # hash.delete_if{|_k, v| has_rails ? v.blank? : blank?(v)}
45
+ # end
46
+
47
+ # hash.delete_if{|_k, v| !v.nil? && v =~ /^(0+|0+\.0+)$/} if options[:remove_zero_values] # values are Strings
48
+ # hash.delete_if{|_k, v| v =~ options[:remove_values_matching]} if options[:remove_values_matching]
49
+
50
+ # if options[:convert_values_to_numeric]
51
+ # hash.each do |k, v|
52
+ # # deal with the :only / :except options to :convert_values_to_numeric
53
+ # next if limit_execution_for_only_or_except(options, :convert_values_to_numeric, k)
54
+
55
+ # # convert if it's a numeric value:
56
+ # case v
57
+ # when /^[+-]?\d+\.\d+$/
58
+ # hash[k] = v.to_f
59
+ # when /^[+-]?\d+$/
60
+ # hash[k] = v.to_i
61
+ # end
62
+ # end
63
+ # end
64
+
65
+ # if options[:value_converters]
66
+ # hash.each do |k, v|
67
+ # converter = options[:value_converters][k]
68
+ # next unless converter
69
+
70
+ # hash[k] = converter.convert(v)
71
+ # end
72
+ # end
73
+
74
+ # hash
75
+ # end
156
76
 
157
77
  protected
158
78
 
@@ -2,18 +2,8 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class << self
5
- # this is processing the headers from the input file
5
+ # transform the headers that were in the file:
6
6
  def header_transformations(header_array, options)
7
- if options[:v2_mode]
8
- header_transformations_v2(header_array, options)
9
- else
10
- header_transformations_v1(header_array, options)
11
- end
12
- end
13
-
14
- # ---- V1.x Version: transform the headers that were in the file: ------------------------------------------
15
- #
16
- def header_transformations_v1(header_array, options)
17
7
  header_array.map!{|x| x.gsub(%r/#{options[:quote_char]}/, '')}
18
8
  header_array.map!{|x| x.strip} if options[:strip_whitespace]
19
9
 
@@ -67,99 +57,7 @@ module SmarterCSV
67
57
  header
68
58
  end
69
59
  end
70
-
71
60
  headers
72
61
  end
73
-
74
- # ---- V2.x Version: transform the headers that were in the file: ------------------------------------------
75
- #
76
- def header_transformations_v2(header_array, options)
77
- return header_array if options[:header_transformations].nil? || options[:header_transformations].empty?
78
-
79
- # do the header transformations the user requested:
80
- if options[:header_transformations]
81
- options[:header_transformations].each do |transformation|
82
- if transformation.respond_to?(:call) # this is used when a user-provided Proc is passed in
83
- header_array = transformation.call(header_array, options)
84
- else
85
- case transformation
86
- when Symbol # this is used for pre-defined transformations that are defined in the SmarterCSV module
87
- header_array = public_send(transformation, header_array, options)
88
- when Hash # this is called for hash arguments, e.g. header_transformations
89
- trans, args = transformation.first # .first treats the hash first element as an array
90
- header_array = apply_transformation(trans, header_array, args, options)
91
- when Array # this can be used for passing additional arguments in array form (e.g. into a Proc)
92
- trans, *args = transformation
93
- header_array = apply_transformation(trans, header_array, args, options)
94
- else
95
- raise SmarterCSV::IncorrectOption, "Invalid transformation type: #{transformation.class}"
96
- end
97
- end
98
- end
99
- end
100
-
101
- header_array
102
- end
103
-
104
- def apply_transformation(transformation, header_array, args, options)
105
- if transformation.respond_to?(:call)
106
- # If transformation is a callable object (like a Proc)
107
- transformation.call(header_array, args, options)
108
- else
109
- # If transformation is a symbol (method name)
110
- public_send(transformation, header_array, args, options)
111
- end
112
- end
113
-
114
- # pre-defined v2 header transformations:
115
-
116
- # these are some pre-defined header transformations which can be used
117
- # all these take the headers array as the input
118
- #
119
- # the computed options can be accessed via @options
120
-
121
- def keys_as_symbols(headers, options)
122
- headers.map do |header|
123
- header.strip.downcase.gsub(%r{#{options[:quote_char]}}, '').gsub(/(\s|-)+/, '_').to_sym
124
- end
125
- end
126
-
127
- def keys_as_strings(headers, options)
128
- headers.map do |header|
129
- header.strip.gsub(%r{#{options[:quote_char]}}, '').downcase.gsub(/(\s|-)+/, '_')
130
- end
131
- end
132
-
133
- def downcase_headers(headers, _options)
134
- headers.map do |header|
135
- header.strip.downcase!
136
- end
137
- end
138
-
139
- def key_mapping(headers, mapping = {}, options)
140
- raise(SmarterCSV::IncorrectOption, "ERROR: incorrect format for key_mapping! Expecting hash with from -> to mappings") if mapping.empty? || !mapping.is_a?(Hash)
141
-
142
- headers_set = headers.to_set
143
- mapping_keys_set = mapping.keys.to_set
144
- silence_keys_set = (options[:silence_missing_keys] || []).to_set
145
-
146
- # Check for missing keys
147
- missing_keys = mapping_keys_set - headers_set - silence_keys_set
148
- raise SmarterCSV::KeyMappingError, "ERROR: cannot map headers: #{missing_keys.to_a.join(', ')}" if missing_keys.any? && !options[:silence_missing_keys]
149
-
150
- # Apply key mapping, retaining nils for explicitly mapped headers
151
- headers.map do |header|
152
- if mapping.key?(header)
153
- # Maps the key according to the mapping, including nil mapping
154
- mapping[header]
155
- elsif options[:remove_unmapped_keys]
156
- # Remove headers not specified in the mapping
157
- nil
158
- else
159
- # Keep the original header if not specified in the mapping
160
- header
161
- end
162
- end
163
- end
164
62
  end
165
63
  end
@@ -3,21 +3,11 @@
3
3
  module SmarterCSV
4
4
  class << self
5
5
  def header_validations(headers, options)
6
- if options[:v2_mode]
7
- header_validations_v2(headers, options)
8
- else
9
- header_validations_v1(headers, options)
10
- end
11
- end
12
-
13
- # ---- V1.x Version: validate the headers -----------------------------------------------------------------
14
-
15
- def header_validations_v1(headers, options)
16
- check_duplicate_headers_v1(headers, options)
17
- check_required_headers_v1(headers, options)
6
+ check_duplicate_headers(headers, options)
7
+ check_required_headers(headers, options)
18
8
  end
19
9
 
20
- def check_duplicate_headers_v1(headers, _options)
10
+ def check_duplicate_headers(headers, _options)
21
11
  header_counts = Hash.new(0)
22
12
  headers.each { |header| header_counts[header] += 1 unless header.nil? }
23
13
 
@@ -28,109 +18,17 @@ module SmarterCSV
28
18
  end
29
19
  end
30
20
 
31
- def check_required_headers_v1(headers, options)
21
+ require 'set'
22
+
23
+ def check_required_headers(headers, options)
32
24
  if options[:required_keys] && options[:required_keys].is_a?(Array)
33
25
  headers_set = headers.to_set
34
26
  missing_keys = options[:required_keys].select { |k| !headers_set.include?(k) }
35
27
 
36
28
  unless missing_keys.empty?
37
- raise SmarterCSV::MissingKeys, "ERROR: missing attributes: #{missing_keys.join(',')}"
29
+ raise SmarterCSV::MissingKeys, "ERROR: missing attributes: #{missing_keys.join(',')}. Check `SmarterCSV.headers` for original headers."
38
30
  end
39
31
  end
40
32
  end
41
-
42
- # ---- V2.x Version: validate the headers -----------------------------------------------------------------
43
-
44
- # def header_validations_v2(headers, options)
45
- # return unless options[:header_validations]
46
-
47
- # options[:header_validations].each do |validation|
48
- # if validation.respond_to?(:call)
49
- # # Directly call if it's a Proc or lambda
50
- # validation.call(headers)
51
- # else
52
- # binding.pry
53
- # # Handle Symbol, Hash, or Array
54
- # method_name, args = validation.is_a?(Symbol) ? [validation, []] : validation
55
- # public_send(method_name, headers, *Array(args))
56
- # end
57
- # end
58
- # end
59
-
60
- def header_validations_v2(headers, options)
61
- return unless options[:header_validations]
62
-
63
- # do the header validations the user requested:
64
- # Header validations typically raise errors directly
65
- #
66
- options[:header_validations].each do |validation|
67
- if validation.respond_to?(:call)
68
- # Directly call if it's a Proc or lambda
69
- validation.call(headers)
70
- else
71
- case validation
72
- when Symbol
73
- public_send(validation, headers)
74
- when Hash
75
- val, args = validation.first
76
- public_send(val, headers, args)
77
- when Array
78
- val, *args = validation
79
- public_send(val, headers, args)
80
- else
81
- raise SmarterCSV::IncorrectOption, "Invalid validation type: #{validation.class}"
82
- end
83
- end
84
- end
85
- end
86
-
87
- # def header_validations_v2_orig(headers, options)
88
- # # do the header validations the user requested:
89
- # # Header validations typically raise errors directly
90
- # #
91
- # if options[:header_validations]
92
- # options[:header_validations].each do |validation|
93
- # case validation
94
- # when Symbol
95
- # public_send(validation, headers)
96
- # when Hash
97
- # val, args = validation.first
98
- # public_send(val, headers, args)
99
- # when Array
100
- # val, args = validation
101
- # public_send(val, headers, args)
102
- # else
103
- # validation.call(headers) unless validation.nil?
104
- # end
105
- # end
106
- # end
107
- # end
108
-
109
- # these are some pre-defined header validations which can be used
110
- # all these take the headers array as the input
111
- #
112
- # the computed options can be accessed via @options
113
-
114
- def unique_headers(headers)
115
- header_counts = Hash.new(0)
116
- headers.each { |header| header_counts[header] += 1 unless header.nil? }
117
-
118
- duplicates = header_counts.select { |_, count| count > 1 }
119
-
120
- unless duplicates.empty?
121
- raise(SmarterCSV::DuplicateHeaders, "Duplicate Headers in CSV: #{duplicates.inspect}")
122
- end
123
- end
124
-
125
- def required_headers(headers, required = [])
126
- raise(SmarterCSV::IncorrectOption, "ERROR: required_headers validation needs an array argument") unless required.is_a?(Array)
127
-
128
- headers_set = headers.to_set
129
- missing = required.select { |r| !headers_set.include?(r) }
130
-
131
- unless missing.empty?
132
- raise(SmarterCSV::MissingKeys, "Missing Headers in CSV: #{missing.inspect}")
133
- end
134
- end
135
33
  end
136
34
  end
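
Note on the hunk above: the required-keys check converts the header list to a Set and reports every required key that is absent. A minimal standalone sketch of that check follows. The method name `check_required_keys` and the error class `MissingKeysError` are hypothetical names for this note (the gem raises `SmarterCSV::MissingKeys`), and the message is shortened.

```ruby
require 'set'

# Hypothetical error class for this sketch only.
class MissingKeysError < StandardError; end

# Raises if any required key is absent from the (already transformed) headers.
def check_required_keys(headers, required_keys)
  return unless required_keys.is_a?(Array)

  headers_set = headers.to_set # Set gives constant-time membership checks
  missing = required_keys.reject { |k| headers_set.include?(k) }
  return if missing.empty?

  raise MissingKeysError, "ERROR: missing attributes: #{missing.join(',')}"
end

check_required_keys(%i[name email], %i[name]) # passes silently
```

The check runs against the transformed headers, which is presumably why the new error message points the user at the original headers for comparison.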
@@ -1,7 +1,7 @@
  # frozen_string_literal: true

  module SmarterCSV
-   COMMON_OPTIONS = {
+   DEFAULT_OPTIONS = {
      acceleration: true,
      auto_row_sep_chars: 500,
      chunk_size: nil,
@@ -15,66 +15,39 @@ module SmarterCSV
      force_utf8: false,
      headers_in_file: true,
      invalid_byte_sequence: '',
-     quote_char: '"',
-     remove_unmapped_keys: false,
-     row_sep: :auto, # was: $/,
-     silence_deprecations: false, # new in 1.11
-     silence_missing_keys: false,
-     skip_lines: nil,
-     user_provided_headers: nil,
-     verbose: false,
-     with_line_numbers: false,
-     v2_mode: false,
-   }.freeze
-
-   V1_DEFAULT_OPTIONS = {
      keep_original_headers: false,
      key_mapping: nil,
+     quote_char: '"',
      remove_empty_hashes: true,
      remove_empty_values: true,
+     remove_unmapped_keys: false,
      remove_values_matching: nil,
      remove_zero_values: false,
      required_headers: nil,
      required_keys: nil,
+     row_sep: :auto, # was: $/,
+     silence_missing_keys: false,
+     skip_lines: nil,
      strings_as_keys: false,
      strip_chars_from_headers: nil,
      strip_whitespace: true,
+     user_provided_headers: nil,
      value_converters: nil,
-     v2_mode: false,
+     verbose: false,
+     with_line_numbers: false,
    }.freeze

-   DEPRECATED_OPTIONS = [
-     :convert_values_to_numeric,
-     :downcase_headers,
-     :keep_original_headers,
-     :key_mapping,
-     :remove_empty_hashes,
-     :remove_empty_values,
-     :remove_values_matching,
-     :remove_zero_values,
-     :required_headers,
-     :required_keys,
-     :stirngs_as_keys,
-     :strip_cars_from_headers,
-     :strip_whitespace,
-     :value_converters,
-   ].freeze
-
    class << self
      # NOTE: this is not called when "parse" methods are tested by themselves
      def process_options(given_options = {})
        puts "User provided options:\n#{pp(given_options)}\n" if given_options[:verbose]

-       # fix invalid input
-       given_options[:invalid_byte_sequence] = '' if given_options[:invalid_byte_sequence].nil?
-
-       # warn about deprecated options / raises error for v2_mode
-       handle_deprecations(given_options)
+       @options = DEFAULT_OPTIONS.dup.merge!(given_options)

-       given_options = preprocess_v2_options(given_options) if given_options[:v2_mode]
+       # fix invalid input
+       @options[:invalid_byte_sequence] ||= ''

-       @options = compute_default_options(given_options).merge!(given_options)
-       puts "Computed options:\n#{pp(@options)}\n" if given_options[:verbose]
+       puts "Computed options:\n#{pp(@options)}\n" if @options[:verbose]

        validate_options!(@options)
        @options
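
Note on the hunk above: the old code assigned into `given_options`, which raises when the caller passes a frozen hash; the new code merges into a copy of the defaults and normalizes that copy instead. A minimal sketch of the pattern, using a hypothetical three-key `DEFAULTS` hash and a hypothetical `process` method rather than the gem's full option set:

```ruby
# Hypothetical, shortened defaults for this sketch only.
DEFAULTS = { quote_char: '"', verbose: false, invalid_byte_sequence: '' }.freeze

def process(given_options = {})
  # Hash#dup returns an unfrozen copy, so merge! mutates only the copy;
  # the caller's (possibly frozen) hash is never written to.
  options = DEFAULTS.dup.merge!(given_options)
  options[:invalid_byte_sequence] ||= '' # repair an explicit nil
  options
end

frozen_input = { invalid_byte_sequence: nil, verbose: true }.freeze
result = process(frozen_input)
```

Because all fix-ups happen on the merged copy, the caller's hash is left exactly as it was passed in.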
@@ -84,35 +57,11 @@ module SmarterCSV
84
57
  #
85
58
  # ONLY FOR BACKWARDS-COMPATIBILITY
86
59
  def default_options
87
- COMMON_OPTIONS.merge(V1_DEFAULT_OPTIONS)
60
+ DEFAULT_OPTIONS
88
61
  end
89
62
 
90
63
  private
91
64
 
92
- def compute_default_options(options = {})
93
- return COMMON_OPTIONS.merge(V1_DEFAULT_OPTIONS) unless options[:v2_mode]
94
-
95
- default_options = {}
96
- if options[:defaults].to_s != 'none'
97
- default_options = COMMON_OPTIONS.dup.merge(V2_DEFAULT_OPTIONS)
98
- if options[:defaults].to_s == 'v1'
99
- default_options.merge(V1_TRANSFORMATIONS)
100
- else
101
- default_options.merge(V2_TRANSFORMATIONS)
102
- end
103
- end
104
- end
105
-
106
- def handle_deprecations(options)
107
- used_deprecated_options = DEPRECATED_OPTIONS & options.keys
108
- message = "SmarterCSV #{VERSION} DEPRECATED OPTIONS: #{pp(used_deprecated_options)}"
109
- if options[:v2_mode]
110
- raise(SmarterCSV::DeprecatedOptions, "ERROR: #{message}") unless used_deprecated_options.empty? || options[:silence_deprecations]
111
- else
112
- puts "DEPRECATION WARNING: #{message}" unless used_deprecated_options.empty? || options[:silence_deprecations]
113
- end
114
- end
115
-
116
65
  def validate_options!(options)
117
66
  # deprecate required_headers
118
67
  unless options[:required_headers].nil?
@@ -141,57 +90,5 @@ module SmarterCSV
141
90
  def pp(value)
142
91
  defined?(AwesomePrint) ? value.awesome_inspect(index: nil) : value.inspect
143
92
  end
144
-
145
- # ---- V2 code ----------------------------------------------------------------------------------------
146
-
147
- V2_DEFAULT_OPTIONS = {
148
- # These need to go to the COMMON_OPTIONS:
149
- remove_empty_hashes: true, # this might need a transformation or move to common options
150
- # ------------
151
- header_transformations: [:keys_as_symbols],
152
- header_validations: [:unique_headers],
153
- # data_transformations: [:replace_blank_with_nil],
154
- # data_validations: [],
155
- hash_transformations: [:strip_spaces, :remove_blank_values],
156
- hash_validations: [],
157
- v2_mode: true,
158
- }.freeze
159
-
160
- V2_TRANSFORMATIONS = {
161
- header_transformations: [:keys_as_symbols],
162
- header_validations: [:unique_headers],
163
- # data_transformations: [:replace_blank_with_nil],
164
- # data_validations: [],
165
- hash_transformations: [:v1_backwards_compatibility],
166
- # hash_transformations: [:remove_empty_keys, :strip_spaces, :remove_blank_values, :convert_values_to_numeric], # ??? :convert_values_to_numeric]
167
- hash_validations: [],
168
- }.freeze
169
-
170
- V1_TRANSFORMATIONS = {
171
- header_transformations: [:keys_as_symbols],
172
- header_validations: [:unique_headers],
173
- # data_transformations: [:replace_blank_with_nil],
174
- # data_validations: [],
175
- hash_transformations: [:strip_spaces, :remove_blank_values, :convert_values_to_numeric],
176
- hash_validations: [],
177
- }.freeze
178
-
179
- def preprocess_v2_options(options)
180
- return options unless options[:v2_mode] || options[:header_transformations]
181
-
182
- # We want to provide safe defaults for easy processing, that is why we have a special keyword :none
183
- # to not do any header transformations..
184
- #
185
- # this is why we need to remove the 'none' here:
186
- #
187
- requested_header_transformations = options[:header_transformations]
188
- if requested_header_transformations.to_s == 'none'
189
- requested_header_transformations = []
190
- else
191
- requested_header_transformations = requested_header_transformations.reject {|x| x.to_s == 'none'} unless requested_header_transformations.nil?
192
- end
193
- options[:header_transformations] = requested_header_transformations || []
194
- options
195
- end
196
93
  end
197
94
  end
@@ -2,7 +2,6 @@
2
2
 
3
3
  module SmarterCSV
4
4
  class SmarterCSVException < StandardError; end
5
- class DeprecatedOptions < SmarterCSVException; end
6
5
  class HeaderSizeMismatch < SmarterCSVException; end
7
6
  class IncorrectOption < SmarterCSVException; end
8
7
  class ValidationError < SmarterCSVException; end
@@ -109,10 +108,6 @@ module SmarterCSV
109
108
 
110
109
  next if options[:remove_empty_hashes] && hash.empty?
111
110
 
112
- #
113
- # should HASH VALIDATIONS go here instead?
114
- #
115
-
116
111
  puts "CSV Line #{@file_line_count}: #{pp(hash)}" if @verbose == '2' # very verbose setting
117
112
  # optional adding of csv_line_number to the hash to help debugging
118
113
  hash[:csv_line_number] = @csv_line_count if options[:with_line_numbers]
@@ -170,19 +165,22 @@ module SmarterCSV
    end

    class << self
-     # Counts the number of quote characters in a line, excluding escaped quotes.
-     # FYI: using Ruby built-in regex processing to determine the number of quotes
      def count_quote_chars(line, quote_char)
        return 0 if line.nil? || quote_char.nil? || quote_char.empty?

-       # Escaped quote character (e.g., if quote_char is ", then escaped is \")
-       escaped_quote = Regexp.escape(quote_char)
+       count = 0
+       escaped = false

-       # Pattern to match a quote character not preceded by a backslash
-       pattern = /(?<!\\)(?:\\\\)*#{escaped_quote}/
+       line.each_char do |char|
+         if char == '\\' && !escaped
+           escaped = true
+         else
+           count += 1 if char == quote_char && !escaped
+           escaped = false
+         end
+       end

-       # Count occurrences
-       line.scan(pattern).count
+       count
      end

      def has_acceleration?
15
15
  @raw_header = nil # header as it appears in the file
16
16
  @result = []
17
17
  @warnings = {}
18
- @v2_mode = false
19
18
  @enforce_utf8 = false # only set to true if needed (after options parsing)
20
19
  end
21
20
 
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module SmarterCSV
-   VERSION = "1.11.0.pre2"
+   VERSION = "1.11.0"
  end
@@ -0,0 +1,102 @@
+ # frozen_string_literal: true
+
+ module SmarterCSV
+   #
+   # Generate CSV files
+   #
+   # Create an instance of the Writer class with the filename and options.
+   # Call `<<` one or multiple times to append data to the file.
+   # Call `finalize` to save the file.
+   #
+   # The `<<` method can take different arguments:
+   #  * a single Hash
+   #  * an array of Hashes
+   #  * nested arrays of arrays of Hashes
+   #
+   # By default SmarterCSV::Writer automatically discovers all headers that are present
+   # in the data on-the-fly. This can be disabled, then only given headers are used.
+   # Disabling can be useful when you want to select attributes from hashes, or ActiveRecord instances.
+   #
+   # If `discover_headers` is enabled, and headers are given, any new headers that are found in the data will still be appended.
+   #
+   # The Writer automatically quotes fields containing the col_sep, row_sep, or the quote_char.
+   #
+   # Options:
+   #   col_sep : defaults to , but can be set to any other character
+   #   row_sep : defaults to LF \n , but can be set to \r\n or \r or anything else
+   #   quote_char : defaults to "
+   #   discover_headers : defaults to true
+   #   headers : defaults to []
+   #   force_quotes: defaults to false
+   #   map_headers: defaults to {}, can be a hash of key -> value mappings
+
+   # IMPORTANT NOTES:
+   #  * Data hashes could contain strings or symbols as keys.
+   #    Make sure to use the correct form when specifying headers manually,
+   #    in combination with the :discover_headers option
+
+   class Writer
+     def initialize(file_path, options = {})
+       @options = options
+       @discover_headers = options.has_key?(:discover_headers) ? (options[:discover_headers] == true) : true
+       @headers = options[:headers] || []
+       @row_sep = options[:row_sep] || "\n" # RFC4180 "\r\n"
+       @col_sep = options[:col_sep] || ','
+       @quote_char = '"'
+       @force_quotes = options[:force_quotes] == true
+       @map_headers = options[:map_headers] || {}
+       @output_file = File.open(file_path, 'w+')
+       # hidden state:
+       @temp_file = Tempfile.new('tempfile', '/tmp')
+       @quote_regex = Regexp.union(@col_sep, @row_sep, @quote_char)
+     end
+
+     def <<(data)
+       case data
+       when Hash
+         process_hash(data)
+       when Array
+         data.each { |item| self << item }
+       when NilClass
+         # ignore
+       else
+         raise ArgumentError, "Invalid data type: #{data.class}. Must be a Hash or an Array."
+       end
+     end
+
+     def finalize
+       # Map headers if :map_headers option is provided
+       mapped_headers = @headers.map { |header| @map_headers[header] || header }
+
+       @temp_file.rewind
+       @output_file.write(mapped_headers.join(@col_sep) + @row_sep)
+       @output_file.write(@temp_file.read)
+       @output_file.flush
+       @output_file.close
+       @temp_file.delete
+     end
+
+     private
+
+     def process_hash(hash)
+       if @discover_headers
+         hash_keys = hash.keys
+         new_keys = hash_keys - @headers
+         @headers.concat(new_keys)
+       end
+
+       # Reorder the hash to match the current headers order and fill missing fields
89
+ ordered_row = @headers.map { |header| hash[header] || '' }
90
+
91
+ @temp_file.write ordered_row.map { |value| escape_csv_field(value) }.join(@col_sep) + @row_sep
92
+ end
93
+
94
+ def escape_csv_field(field)
95
+ if @force_quotes || field.to_s.match(@quote_regex)
96
+ "\"#{field}\""
97
+ else
98
+ field.to_s
99
+ end
100
+ end
101
+ end
102
+ end
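The new `writer.rb` buffers rows in a temp file and writes the header line only in `finalize`, because with `discover_headers` enabled the full header list is not known until the last row has been seen. The sketch below imitates that idea in miniature. `TinyCsvBuffer` is a hypothetical class, not the gem's `Writer`: it uses a `StringIO` in place of the `Tempfile`, returns a string instead of writing a file, and leaves out quoting to keep the focus on header discovery.

```ruby
require 'stringio'

# Hypothetical miniature of the buffering idea, not the gem's Writer class.
class TinyCsvBuffer
  def initialize(col_sep: ',', row_sep: "\n")
    @col_sep = col_sep
    @row_sep = row_sep
    @headers = []
    @body = StringIO.new # stands in for the Tempfile used in the diff
  end

  # Append one Hash; keys not seen before are added to the header list.
  def <<(hash)
    @headers.concat(hash.keys - @headers)
    row = @headers.map { |header| hash[header] || '' }
    @body.write(row.join(@col_sep) + @row_sep)
    self
  end

  # Emit the header line first, then the buffered rows.
  def finalize
    @headers.join(@col_sep) + @row_sep + @body.string
  end
end

buf = TinyCsvBuffer.new
buf << { name: 'Ann', age: 30 }
buf << { name: 'Bob', city: 'Oslo' }
puts buf.finalize
# name,age,city
# Ann,30
# Bob,,Oslo
```

As in the diff's `process_hash`, each row is serialized against the headers known at the time it is appended, so rows written before a new header is discovered have fewer columns than the final header line.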
data/lib/smarter_csv.rb CHANGED
@@ -1,7 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- require 'set'
4
-
5
3
  require "smarter_csv/version"
6
4
  require "smarter_csv/file_io"
7
5
  require "smarter_csv/options_processing"
@@ -12,6 +10,7 @@ require 'smarter_csv/header_validations'
12
10
  require "smarter_csv/headers"
13
11
  require "smarter_csv/hash_transformations"
14
12
  require "smarter_csv/parse"
13
+ require "smarter_csv/writer"
15
14
 
16
15
  # load the C-extension:
17
16
  case RUBY_ENGINE
data/smarter_csv.gemspec CHANGED
@@ -9,8 +9,8 @@ Gem::Specification.new do |spec|
9
9
  spec.authors = ["Tilo Sloboda"]
10
10
  spec.email = ["tilo.sloboda@gmail.com"]
11
11
 
12
- spec.summary = "Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots of optional features, e.g. chunked processing for huge CSV files"
13
- spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with optional features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
12
+ spec.summary = "CSV Reading and Writing"
13
+ spec.description = "Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with lots of features for processing large files in parallel, embedded comments, unusual field- and record-separators, flexible mapping of CSV-headers to Hash-keys"
14
14
  spec.homepage = "https://github.com/tilo/smarter_csv"
15
15
  spec.license = 'MIT'
16
16
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: smarter_csv
3
3
  version: !ruby/object:Gem::Version
4
- version: 1.11.0.pre2
4
+ version: 1.11.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Tilo Sloboda
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2024-01-14 00:00:00.000000000 Z
11
+ date: 2024-07-02 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: awesome_print
@@ -95,7 +95,7 @@ dependencies:
95
95
  - !ruby/object:Gem::Version
96
96
  version: '0'
97
97
  description: Ruby Gem for smarter importing of CSV Files as Array(s) of Hashes, with
98
- optional features for processing large files in parallel, embedded comments, unusual
98
+ lots of features for processing large files in parallel, embedded comments, unusual
99
99
  field- and record-separators, flexible mapping of CSV-headers to Hash-keys
100
100
  email:
101
101
  - tilo.sloboda@gmail.com
@@ -104,6 +104,7 @@ extensions:
104
104
  - ext/smarter_csv/extconf.rb
105
105
  extra_rdoc_files: []
106
106
  files:
107
+ - ".rspec"
107
108
  - ".rubocop.yml"
108
109
  - ".rvmrc"
109
110
  - CHANGELOG.md
@@ -127,6 +128,7 @@ files:
127
128
  - lib/smarter_csv/smarter_csv.rb
128
129
  - lib/smarter_csv/variables.rb
129
130
  - lib/smarter_csv/version.rb
131
+ - lib/smarter_csv/writer.rb
130
132
  - smarter_csv.gemspec
131
133
  homepage: https://github.com/tilo/smarter_csv
132
134
  licenses:
@@ -147,13 +149,12 @@ required_ruby_version: !ruby/object:Gem::Requirement
147
149
  version: 2.5.0
148
150
  required_rubygems_version: !ruby/object:Gem::Requirement
149
151
  requirements:
150
- - - ">"
152
+ - - ">="
151
153
  - !ruby/object:Gem::Version
152
- version: 1.3.1
154
+ version: '0'
153
155
  requirements: []
154
156
  rubygems_version: 3.2.3
155
157
  signing_key:
156
158
  specification_version: 4
157
- summary: Ruby Gem for smarter importing of CSV Files (and CSV-like files), with lots
158
- of optional features, e.g. chunked processing for huge CSV files
159
+ summary: CSV Reading and Writing
159
160
  test_files: []