csvreader 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: e6b05520911361eae641ef53b9550d11974478fe
-   data.tar.gz: a119bd4a8e85408b84b9683d8fa5aa61884bfb1e
+   metadata.gz: af0fcea1b598e6123786a05532a6f5b2e10a4095
+   data.tar.gz: ba2dc18a6076e425847b440c05819e898f0a66b2
  SHA512:
-   metadata.gz: a0938a895f630f06740ecb2d947adccb3f6384ad1f7c06316bde43d60b750fae76fda227138c5827c7da404590b96102feb8089e5fb83d056c180a313a4b17e1
-   data.tar.gz: 14cc7b7da521649b5713fb6932a9d201962fbf5bf9df2f09569066baaa29bdffc8edcc89e7903d2aceb7eb6df1c3f05ebff4b27623dc8a427cfaba0ddf4df8f1
+   metadata.gz: 28f60b98574e5331b53280f27017fae776c787ee1b7a56815c8a8f9c21a0926e6f561ca8a75f1464f1743849f989a52f122fbe4f20086de8159cf2df53b71bbe
+   data.tar.gz: 6d5b80b11e4774bc227bffe62bc829ab19b70f22fd69ca35c54526b85f261fa5c4bf0a7d87c9ba715738f50a6710bdd843f3b6cc1581f0d88744332fdf062796
data/Manifest.txt CHANGED
@@ -8,5 +8,6 @@ lib/csvreader/reader.rb
  lib/csvreader/version.rb
  test/data/beer.csv
  test/data/beer11.csv
+ test/data/shakespeare.csv
  test/helper.rb
  test/test_reader.rb
data/README.md CHANGED
@@ -11,6 +11,179 @@
 
  ## Usage
 
+ ``` ruby
+ line = "1,2,3"
+ values = CsvReader.parse_line( line )
+ pp values
+ # => ["1","2","3"]
+ ```
+
+ or use the convenience helpers:
+
+ ``` ruby
+ txt = <<TXT
+ 1,2,3
+ 4,5,6
+ TXT
+
+ records = CsvReader.parse( txt )
+ pp records
+ # => [["1","2","3"],
+ #     ["4","5","6"]]
+
+ # -or-
+
+ records = CsvReader.read( "values.csv" )
+ pp records
+ # => [["1","2","3"],
+ #     ["4","5","6"]]
+
+ # -or-
+
+ CsvReader.foreach( "values.csv" ) do |rec|
+   pp rec
+ end
+ # => ["1","2","3"]
+ # => ["4","5","6"]
+ ```
+
+
+ ### What about headers?
+
+ Use the `CsvHashReader`
+ if the first line is a header (or, if the header is missing, pass in the headers
+ as an array) and you want your records as hashes instead of arrays of strings.
+ Example:
+
+ ``` ruby
+ txt = <<TXT
+ A,B,C
+ 1,2,3
+ 4,5,6
+ TXT
+
+ records = CsvHashReader.parse( txt )
+ pp records
+
+ # -or-
+
+ txt2 = <<TXT
+ 1,2,3
+ 4,5,6
+ TXT
+
+ records = CsvHashReader.parse( txt2, headers: ["A","B","C"] )
+ pp records
+
+ # => [{"A": "1", "B": "2", "C": "3"},
+ #     {"A": "4", "B": "5", "C": "6"}]
+
+ # -or-
+
+ records = CsvHashReader.read( "hash.csv" )
+ pp records
+ # => [{"A": "1", "B": "2", "C": "3"},
+ #     {"A": "4", "B": "5", "C": "6"}]
+
+ # -or-
+
+ CsvHashReader.foreach( "hash.csv" ) do |rec|
+   pp rec
+ end
+ # => {"A": "1", "B": "2", "C": "3"}
+ # => {"A": "4", "B": "5", "C": "6"}
+ ```
+
+
+
+ ## Frequently Asked Questions (FAQ) and Answers
+
+ ### Q: What's CSV the right way? What best practices can I use?
+
+ Use best practices out-of-the-box with zero-configuration.
+ Do you know how to skip blank lines or how to add `#` single-line comments?
+ Or how to trim leading and trailing spaces? No worries. It's all turned on by default.
+
+ That means you can use
+
+ ```
+ #######
+ # try with some comments
+ # and blank lines even before header (first row)
+
+ Brewery,City,Name,Abv
+ Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
+ Augustiner Bräu München,München,Edelstoff,5.6%
+
+ Bayerische Staatsbrauerei Weihenstephan, Freising, Hefe Weissbier, 5.4%
+ Brauerei Spezial, Bamberg, Rauchbier Märzen, 5.1%
+ Hacker-Pschorr Bräu, München, Münchner Dunkel, 5.0%
+ Staatliches Hofbräuhaus München, München, Hofbräu Oktoberfestbier, 6.3%
+ ```
+
+ instead of strict "classic"
+ (no blank lines, no comments, no leading and trailing spaces, etc.):
+
+ ```
+ Brewery,City,Name,Abv
+ Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
+ Augustiner Bräu München,München,Edelstoff,5.6%
+ Bayerische Staatsbrauerei Weihenstephan,Freising,Hefe Weissbier,5.4%
+ Brauerei Spezial,Bamberg,Rauchbier Märzen,5.1%
+ Hacker-Pschorr Bräu,München,Münchner Dunkel,5.0%
+ Staatliches Hofbräuhaus München,München,Hofbräu Oktoberfestbier,6.3%
+ ```
+
+
+
+ ### Q: How can I change the separator to semicolon (`;`) or pipe (`|`)?
+
+ Pass in the `sep` keyword option. Example:
+
+ ``` ruby
+ CsvReader.parse_line( ..., sep: ';' )
+ CsvReader.parse( ..., sep: ';' )
+ CsvReader.read( ..., sep: ';' )
+ # ...
+ CsvReader.parse_line( ..., sep: '|' )
+ CsvReader.parse( ..., sep: '|' )
+ CsvReader.read( ..., sep: '|' )
+ # ...
+ # and so on
+ ```
+
+
+ Note: If you use tab (`\t`) use the `TabReader`! Why? Tab != CSV. Yes, tab is
+ its own (even) simpler format
+ (e.g. no escape rules, no newlines in values, etc.),
+ see [`TabReader` »](https://github.com/datatxt/tabreader).
+
+
+
+ ### Q: What's broken in the standard library CSV reader?
+
+ Two major design bugs and many, many minor ones.
+
+ 1) The CSV class uses `line.split(',')` with some kludges (†) with the claim it's faster.
+ What?! The right way: CSV needs its own purpose-built parser. There's no other
+ way you can handle all the (edge) cases with double quotes and escaped doubled-up
+ double quotes. Period.
+
+ For example, the CSV class cannot handle leading or trailing spaces
+ for double quoted values `1,•"2","3"•`.
+ Or double quotes inside values, and so on and on.
+
+ (†): kludge - a workaround or quick-and-dirty solution that is clumsy, inelegant, inefficient, difficult to extend and hard to maintain
+
+ 2) The CSV class returns `nil` for `,,` but an empty string (`""`)
+ for `"","",""`. The right way: All values are always strings. Period.
+
+ If you want to use `nil` you MUST configure a string (or strings)
+ such as `NA`, `n/a`, `\N`, or similar that map to `nil`.
+
+
+
+
 
  ## Alternatives
 
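The `nil` versus empty-string point from the README FAQ can be checked directly against Ruby's standard `csv` library. The following is a minimal sketch that assumes only the standard library; `stdlib_values` and `all_strings` are hypothetical helper names used for illustration and are not part of csvreader:

``` ruby
require 'csv'

# Parse a single line with the standard library CSV class.
def stdlib_values( line )
  CSV.parse_line( line )
end

# Hypothetical helper (not part of csvreader): turn every value into a
# string so that unquoted empty fields become "" instead of nil.
def all_strings( values )
  values.map { |value| value.to_s }
end

pp stdlib_values( ',,' )                  # unquoted empty fields
pp stdlib_values( '"","",""' )            # quoted empty fields
pp all_strings( stdlib_values( ',,' ) )   # normalized to strings only
```

With the standard library, the first call yields three `nil` values while the second yields three empty strings; the helper shows one possible way to get the "all values are always strings" behavior the README argues for.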
data/lib/csvreader/reader.rb CHANGED
@@ -96,21 +96,7 @@ end # module Csvv
 
  class CsvReader
 
-   ####################
-   # helper methods
-   def self.unwrap( row_or_array ) ## unwrap row - find a better name? why? why not?
-     ## return row values as array of strings
-     if row_or_array.is_a?( CSV::Row )
-       row = row_or_array
-       row.fields ## gets array of string of field values
-     else ## assume "classic" array of strings
-       array = row_or_array
-     end
-   end
-
-
-
-   def self.foreach( path, sep: Csv.config.sep, headers: true )
+   def self.foreach( path, sep: Csv.config.sep, headers: false )
      csv_options = Csv.config.default_options.merge(
        headers: headers,
        col_sep: sep,
@@ -122,8 +108,7 @@ class CsvReader
      end
    end
 
-
-   def self.read( path, sep: Csv.config.sep, headers: true )
+   def self.read( path, sep: Csv.config.sep, headers: false )
      ## note: use our own file.open
      ## always use utf-8 for now
      ## check/todo: add skip option bom too - why? why not?
@@ -131,7 +116,7 @@ class CsvReader
      parse( txt, sep: sep, headers: headers )
    end
 
-   def self.parse( txt, sep: Csv.config.sep, headers: true )
+   def self.parse( txt, sep: Csv.config.sep, headers: false )
      csv_options = Csv.config.default_options.merge(
        headers: headers,
        col_sep: sep
@@ -140,6 +125,7 @@ class CsvReader
      CSV.parse( txt, csv_options )
    end
 
+
    def self.parse_line( txt, sep: Csv.config.sep )
      ## note: do NOT include headers option (otherwise single row gets skipped as first header row :-)
      csv_options = Csv.config.default_options.merge(
@@ -151,7 +137,6 @@ class CsvReader
    end
 
 
-
    def self.header( path, sep: Csv.config.sep ) ## use header or headers - or use both (with alias)?
      # read first lines (only)
      # and parse with csv to get header from csv library itself
@@ -185,4 +170,38 @@ class CsvReader
      ## hash record does NOT work for single line/row
      parse_line( lines, sep: sep )
    end # method self.header
+
+   ####################
+   # helper methods
+   def self.unwrap( row_or_array ) ## unwrap row - find a better name? why? why not?
+     ## return row values as array of strings
+     if row_or_array.is_a?( CSV::Row )
+       row = row_or_array
+       row.fields ## gets array of string of field values
+     else ## assume "classic" array of strings
+       array = row_or_array
+     end
+   end
  end # class CsvReader
+
+
+
+ class CsvHashReader
+
+   def self.read( path, sep: Csv.config.sep, headers: true )
+     CsvReader.read( path, sep: sep, headers: headers )
+   end
+
+   def self.parse( txt, sep: Csv.config.sep, headers: true )
+     CsvReader.parse( txt, sep: sep, headers: headers )
+   end
+
+   def self.foreach( path, sep: Csv.config.sep, headers: true, &block )
+     CsvReader.foreach( path, sep: sep, headers: headers, &block )
+   end
+
+   def self.header( path, sep: Csv.config.sep ) ## add header too? why? why not?
+     CsvReader.header( path, sep: sep )
+   end
+
+ end # class CsvHashReader
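The practical effect of flipping the `headers` default from `true` to `false` in `CsvReader`, and keeping `true` in the new `CsvHashReader`, can be sketched with the standard `csv` library alone. This is only an illustration under stated assumptions: `parse_arrays` and `parse_table` are hypothetical stand-ins for the two entry points, and `SAMPLE_TXT` is made-up input; none of them is part of csvreader.

``` ruby
require 'csv'

SAMPLE_TXT = <<~TXT
  A;B;C
  1;2;3
  4;5;6
TXT

# Hypothetical stand-in for the array-style reader (headers: false).
def parse_arrays( txt, sep: ',' )
  CSV.parse( txt, headers: false, col_sep: sep )
end

# Hypothetical stand-in for the hash-style reader (headers: true).
def parse_table( txt, sep: ',' )
  CSV.parse( txt, headers: true, col_sep: sep )
end

rows  = parse_arrays( SAMPLE_TXT, sep: ';' )
table = parse_table( SAMPLE_TXT, sep: ';' )

pp rows.size                      # header row is counted as a record
pp table.size                     # header row is NOT counted
pp table.map { |row| row.to_h }   # one hash per data row
```

With `headers: false` the result is a plain array of arrays that includes the header row; with `headers: true` the result is a `CSV::Table` whose rows exclude the header. That matches the 7 versus 6 row counts asserted in the updated tests.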
data/lib/csvreader/version.rb CHANGED
@@ -4,7 +4,7 @@
  class CsvReader ## note: uses a class for now - change to module - why? why not?
 
    MAJOR = 0 ## todo: namespace inside version or something - why? why not??
-   MINOR = 1
+   MINOR = 2
    PATCH = 0
    VERSION = [MAJOR,MINOR,PATCH].join('.')
 
data/test/data/shakespeare.csv ADDED
@@ -0,0 +1,9 @@
+ Quote,Play,Cite
+ Sweet are the uses of adversity,As You Like It,"Act 2, scene 1, 12"
+ All the world's a stage,As You Like It,"Act 2, scene 7, 139"
+ "We few, we happy few",Henry V,
+ """Seems,"" madam! Nay it is; I know not ""seems.""",Hamlet,(1.ii.76)
+ "To be, or not to be",Hamlet,"Act 3, scene 1, 55"
+ What's in a name? That which we call a rose by any other name would smell as sweet.,Romeo and Juliet,"(II, ii, 1-2)"
+ "O Romeo, Romeo, wherefore art thou Romeo?",Romeo and Juliet,"Act 2, scene 2, 33"
+ "Tomorrow, and tomorrow, and tomorrow",Macbeth,"Act 5, scene 5, 19"
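The new test file exercises commas and doubled-up double quotes inside quoted values. As a rough illustration of why a plain `split` on the separator is not enough for such lines, the sketch below compares it with a quote-aware parse. It uses the standard `csv` library purely as a stand-in for a quote-aware parser; `naive_fields` and `quoted_fields` are hypothetical names, and the input is one line taken from the file above.

``` ruby
require 'csv'

HAMLET_LINE = '"""Seems,"" madam! Nay it is; I know not ""seems.""",Hamlet,(1.ii.76)'

# Naive approach: cut at every comma, ignoring double quotes.
def naive_fields( line )
  line.split( ',' )
end

# Quote-aware approach: commas inside "..." stay part of the value and
# a doubled-up "" is unescaped to a single ".
def quoted_fields( line )
  CSV.parse_line( line )
end

pp naive_fields( HAMLET_LINE ).size    # too many pieces
pp quoted_fields( HAMLET_LINE ).size   # three fields, as intended
pp quoted_fields( HAMLET_LINE ).first
```

The naive split cuts the quote at the comma after `Seems` and leaves the quote characters in place, while the quote-aware parse returns the three intended fields with the inner quotes unescaped.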
data/test/test_reader.rb CHANGED
@@ -9,39 +9,40 @@ require 'helper'
 
  class TestReader < MiniTest::Test
 
+
  def test_read
    puts "== read: beer.csv:"
-   table = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv" ) ## returns CSV::Table
+   data = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv" )
 
-   pp table.class.name
-   pp table
-   pp table.to_a ## note: includes header (first row with column names)
+   pp data.class.name
+   pp data
 
-   table.each do |row| ## note: will skip (NOT include) header row!!
+   data.each do |row|
      pp row
    end
-   puts " #{table.size} rows" ## note: again will skip (NOT include) header row in count!!!
-   assert_equal 6, table.size
+   puts " #{data.size} rows"
+   assert_equal 7, data.size ## note: include header row in count
  end
 
- def test_read_header_false
-   puts "== read (headers: false): beer.csv:"
-   data = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv", headers: false )
+ def test_read_hash
+   puts "== read (hash): beer.csv:"
+   table = CsvHashReader.read( "#{CsvReader.test_data_dir}/beer.csv" ) ## returns CSV::Table
 
-   pp data.class.name
-   pp data
+   pp table.class.name
+   pp table
+   pp table.to_a ## note: includes header (first row with column names)
 
-   data.each do |row|
+   table.each do |row| ## note: will skip (NOT include) header row!!
      pp row
    end
-   puts " #{data.size} rows"
-   assert_equal 7, data.size ## note: include header row in count
+   puts " #{table.size} rows" ## note: again will skip (NOT include) header row in count!!!
+   assert_equal 6, table.size
  end
 
 
- def test_read11
-   puts "== read: beer11.csv:"
-   table = CsvReader.read( "#{CsvReader.test_data_dir}/beer11.csv" )
+ def test_read_hash11
+   puts "== read (hash): beer11.csv:"
+   table = CsvHashReader.read( "#{CsvReader.test_data_dir}/beer11.csv" )
    pp table
    pp table.to_a ## note: includes header (first row with column names)
 
@@ -90,28 +91,29 @@ def test_header11
  end
 
 
+
  def test_foreach
-   puts "== foreach: beer.csv:"
-   CsvReader.foreach( "#{CsvReader.test_data_dir}/beer.csv" ) do |row|
-     pp row
-     pp row.fields
+   puts "== foreach: beer11.csv:"
+   CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+     pp row ## note: is Array (no .fields available!!!!!)
    end
    assert true
  end
 
- def test_foreach11
-   puts "== foreach: beer11.csv:"
-   CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+ def test_foreach_hash
+   puts "== foreach (hash): beer.csv:"
+   CsvHashReader.foreach( "#{CsvReader.test_data_dir}/beer.csv" ) do |row|
      pp row
      pp row.fields
    end
    assert true
  end
 
- def test_foreach_header_false
-   puts "== foreach (headers: false): beer11.csv:"
-   CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv", headers: false ) do |row|
-     pp row ## note: is Array (no .fields available!!!!!)
+ def test_foreach_hash11
+   puts "== foreach (hash): beer11.csv:"
+   CsvHashReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+     pp row
+     pp row.fields
    end
    assert true
  end
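The test comments above note that `foreach` rows are plain arrays without headers (no `.fields`) and `CSV::Row` objects with headers. The sketch below illustrates that difference with the standard `csv` library and a temporary file. The file content is made up, `collect_rows` is a hypothetical helper, and `unwrap` is a condensed restatement of the helper shown in the reader diff rather than the gem's own code.

``` ruby
require 'csv'
require 'tempfile'

# Same idea as the unwrap helper in the diff: always hand back a plain array.
def unwrap( row_or_array )
  row_or_array.is_a?( CSV::Row ) ? row_or_array.fields : row_or_array
end

# Collect [class, values] pairs for every row that CSV.foreach yields.
def collect_rows( path, headers: )
  rows = []
  CSV.foreach( path, headers: headers ) do |row|
    rows << [row.class, unwrap( row )]
  end
  rows
end

Tempfile.create( ['beer', '.csv'] ) do |file|
  file.write( "Brewery,City\nAndechser Klosterbrauerei,Andechs\n" )
  file.flush
  pp collect_rows( file.path, headers: false )   # Array rows, header included
  pp collect_rows( file.path, headers: true )    # CSV::Row rows, header skipped
end
```

Wrapping the row in `unwrap` lets calling code treat both cases the same way, which is presumably why the helper is kept in `CsvReader`.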
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: csvreader
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.2.0
  platform: ruby
  authors:
  - Gerald Bauer
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2018-08-15 00:00:00.000000000 Z
+ date: 2018-08-19 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rdoc
@@ -59,6 +59,7 @@ files:
  - lib/csvreader/version.rb
  - test/data/beer.csv
  - test/data/beer11.csv
+ - test/data/shakespeare.csv
  - test/helper.rb
  - test/test_reader.rb
  homepage: https://github.com/csv11/csvreader