csvreader 0.1.0 → 0.2.0

Files changed (8)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: e6b05520911361eae641ef53b9550d11974478fe
-  data.tar.gz: a119bd4a8e85408b84b9683d8fa5aa61884bfb1e
+  metadata.gz: af0fcea1b598e6123786a05532a6f5b2e10a4095
+  data.tar.gz: ba2dc18a6076e425847b440c05819e898f0a66b2
 SHA512:
-  metadata.gz: a0938a895f630f06740ecb2d947adccb3f6384ad1f7c06316bde43d60b750fae76fda227138c5827c7da404590b96102feb8089e5fb83d056c180a313a4b17e1
-  data.tar.gz: 14cc7b7da521649b5713fb6932a9d201962fbf5bf9df2f09569066baaa29bdffc8edcc89e7903d2aceb7eb6df1c3f05ebff4b27623dc8a427cfaba0ddf4df8f1
+  metadata.gz: 28f60b98574e5331b53280f27017fae776c787ee1b7a56815c8a8f9c21a0926e6f561ca8a75f1464f1743849f989a52f122fbe4f20086de8159cf2df53b71bbe
+  data.tar.gz: 6d5b80b11e4774bc227bffe62bc829ab19b70f22fd69ca35c54526b85f261fa5c4bf0a7d87c9ba715738f50a6710bdd843f3b6cc1581f0d88744332fdf062796

data/Manifest.txt CHANGED

@@ -8,5 +8,6 @@ lib/csvreader/reader.rb
 lib/csvreader/version.rb
 test/data/beer.csv
 test/data/beer11.csv
+test/data/shakespeare.csv
 test/helper.rb
 test/test_reader.rb

data/README.md CHANGED

@@ -11,6 +11,179 @@
 ## Usage
+``` ruby
+line = "1,2,3"
+values = CsvReader.parse_line( line )
+pp values
+# => ["1","2","3"]
+```
+or use the convenience helpers:
+``` ruby
+txt = <<TXT
+1,2,3
+4,5,6
+TXT
+records = CsvReader.parse( txt )
+pp records
+# => [["1","2","3"],
+#     ["4","5","6"]]
+# -or-
+records = CsvReader.read( "values.csv" )
+pp records
+# => [["1","2","3"],
+#     ["4","5","6"]]
+# -or-
+CsvReader.foreach( "values.csv" ) do |rec|
+  pp rec
+end
+# => ["1","2","3"]
+# => ["4","5","6"]
+```
+### What about headers?
+Use the `CsvHashReader`
+if the first line is a header (or if missing pass in the headers
+as an array) and you want your records as hashes instead of arrays of strings.
+Example:
+``` ruby
+txt = <<TXT
+A,B,C
+1,2,3
+4,5,6
+TXT
+records = CsvHashReader.parse( txt )
+pp records
+# -or-
+txt2 = <<TXT
+1,2,3
+4,5,6
+TXT
+records = CsvHashReader.parse( txt2, headers: ["A","B","C"] )
+pp records
+# => [{"A": "1", "B": "2", "C": "3"},
+#     {"A": "4", "B": "5", "C": "6"}]
+# -or-
+records = CsvHashReader.read( "hash.csv" )
+pp records
+# => [{"A": "1", "B": "2", "C": "3"},
+#     {"A": "4", "B": "5", "C": "6"}]
+# -or-
+CsvHashReader.foreach( "hash.csv" ) do |rec|
+  pp rec
+end
+# => {"A": "1", "B": "2", "C": "3"}
+# => {"A": "4", "B": "5", "C": "6"}
+```
+## Frequently Asked Questions (FAQ) and Answers
+### Q: What's CSV the right way? What best practices can I use?
+Use best practices out-of-the-box with zero configuration.
+Do you know how to skip blank lines or how to add `#` single-line comments?
+Or how to trim leading and trailing spaces?  No worries. It's all turned on by default.
+So yes, you can use
+```
+#######
+# try with some comments
+#   and blank lines even before header (first row)
+Brewery,City,Name,Abv
+Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
+Augustiner Bräu München,München,Edelstoff,5.6%
+Bayerische Staatsbrauerei Weihenstephan,  Freising,  Hefe Weissbier,   5.4%
+Brauerei Spezial,                         Bamberg,   Rauchbier Märzen, 5.1%
+Hacker-Pschorr Bräu,                      München,   Münchner Dunkel,  5.0%
+Staatliches Hofbräuhaus München,          München,   Hofbräu Oktoberfestbier, 6.3%
+```
+instead of strict "classic"
+(no blank lines, no comments, no leading and trailing spaces, etc.):
+```
+Brewery,City,Name,Abv
+Andechser Klosterbrauerei,Andechs,Doppelbock Dunkel,7%
+Augustiner Bräu München,München,Edelstoff,5.6%
+Bayerische Staatsbrauerei Weihenstephan,Freising,Hefe Weissbier,5.4%
+Brauerei Spezial,Bamberg,Rauchbier Märzen,5.1%
+Hacker-Pschorr Bräu,München,Münchner Dunkel,5.0%
+Staatliches Hofbräuhaus München,München,Hofbräu Oktoberfestbier,6.3%
+```
+### Q: How can I change the separator to semicolon (`;`) or pipe (`|`)?
+Pass in the `sep` keyword option. Example:
+``` ruby
+CsvReader.parse_line( ..., sep: ';' )
+CsvReader.parse( ..., sep: ';' )
+CsvReader.read( ..., sep: ';' )
+# ...
+CsvReader.parse_line( ..., sep: '|' )
+CsvReader.parse( ..., sep: '|' )
+CsvReader.read( ..., sep: '|' )
+# ...
+# and so on
+```
+Note: If you use tab (`\t`) use the `TabReader`! Why? Tab != CSV. Yes, tab is
+its own (even) simpler format
+(e.g. no escape rules, no newlines in values, etc.),
+see [`TabReader` »](https://github.com/datatxt/tabreader).
+### Q: What's broken in the standard library CSV reader?
+Two major design bugs and many, many minor ones.
+1) The CSV class uses `line.split(',')` with some kludges (†) with the claim it's faster.
+What?! The right way: CSV needs its own purpose-built parser. There's no other
+way you can handle all the (edge) cases with double quotes and escaped doubled up
+double quotes. Period.
+For example, the CSV class cannot handle leading or trailing spaces
+for double quoted values `1,•"2","3"•`.
+Nor can it handle double quotes inside values, and so on.
+(†): kludge - a workaround or quick-and-dirty solution that is clumsy, inelegant, inefficient, difficult to extend and hard to maintain
+2) The CSV class returns `nil` for `,,` but an empty string (`""`)
+for `"","",""`. The right way: All values are always strings. Period.
+If you want to use `nil` you MUST configure a string (or strings)
+such as `NA`, `n/a`, `\N`, or similar that map to `nil`.
 ## Alternatives
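
The `nil` versus empty string point in the README addition above can be checked directly against Ruby's standard `csv` library. The following is a minimal sketch, not csvreader's implementation: the `NA_VALUES` list and the `normalize` helper are hypothetical names used only for illustration.

```ruby
require 'csv'

# Hypothetical helper (not part of csvreader): keep every value a string
# and map only explicitly configured markers to nil.
NA_VALUES = ['NA', 'n/a', '\N'].freeze

def normalize( values )
  values.map do |value|
    value = value.to_s                     # nil (unquoted empty field) becomes ""
    NA_VALUES.include?( value ) ? nil : value
  end
end

unquoted = CSV.parse_line( ',,' )          # stdlib: unquoted empty fields are nil
quoted   = CSV.parse_line( '"","",""' )    # stdlib: quoted empty fields are ""

pp unquoted                                  # => [nil, nil, nil]
pp quoted                                    # => ["", "", ""]
pp normalize( unquoted )                     # => ["", "", ""]
pp normalize( CSV.parse_line( '1,NA,3' ) )   # => ["1", nil, "3"]
```

With a helper like this, `nil` only ever appears where a configured marker such as `NA` was present in the data, which is the behavior the README argues for.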

data/lib/csvreader/reader.rb CHANGED

@@ -96,21 +96,7 @@ end   # module Csvv
 class CsvReader
-  ####################
-  # helper methods
-  def self.unwrap( row_or_array )   ## unwrap row - find a better name? why? why not?
-    ## return row values as array of strings
-    if row_or_array.is_a?( CSV::Row )
-      row = row_or_array
-      row.fields   ## gets array of string of field values
-    else  ## assume "classic" array of strings
-      array = row_or_array
-    end
-  end
-  def self.foreach( path, sep: Csv.config.sep, headers: true )
+  def self.foreach( path, sep: Csv.config.sep, headers: false )
     csv_options = Csv.config.default_options.merge(
                      headers: headers,
                      col_sep: sep,
@@ -122,8 +108,7 @@ class CsvReader
     end
   end
-  def self.read( path, sep: Csv.config.sep, headers: true )
+  def self.read( path, sep: Csv.config.sep, headers: false )
     ## note: use our own file.open
     ##   always use utf-8 for now
     ##    check/todo: add skip option bom too - why? why not?
@@ -131,7 +116,7 @@ class CsvReader
     parse( txt, sep: sep, headers: headers )
   end
-  def self.parse( txt, sep: Csv.config.sep, headers: true )
+  def self.parse( txt, sep: Csv.config.sep, headers: false )
     csv_options = Csv.config.default_options.merge(
                      headers: headers,
                      col_sep: sep
@@ -140,6 +125,7 @@ class CsvReader
     CSV.parse( txt, csv_options )
   end
   def self.parse_line( txt, sep: Csv.config.sep )
     ## note: do NOT include headers option (otherwise single row gets skipped as first header row :-)
     csv_options = Csv.config.default_options.merge(
@@ -151,7 +137,6 @@ class CsvReader
   end
   def self.header( path, sep: Csv.config.sep )   ## use header or headers - or use both (with alias)?
       # read first lines (only)
       #  and parse with csv to get header from csv library itself
@@ -185,4 +170,38 @@ class CsvReader
       ##   hash record does NOT work for single line/row
       parse_line( lines, sep: sep )
     end  # method self.header
+    ####################
+    # helper methods
+    def self.unwrap( row_or_array )   ## unwrap row - find a better name? why? why not?
+      ## return row values as array of strings
+      if row_or_array.is_a?( CSV::Row )
+        row = row_or_array
+        row.fields   ## gets array of string of field values
+      else  ## assume "classic" array of strings
+        array = row_or_array
+      end
+    end
 end # class CsvReader
+class CsvHashReader
+def self.read( path, sep: Csv.config.sep, headers: true )
+  CsvReader.read( path, sep: sep, headers: headers )
+end
+def self.parse( txt, sep: Csv.config.sep, headers: true )
+  CsvReader.parse( txt, sep: sep, headers: headers )
+end
+def self.foreach( path, sep: Csv.config.sep, headers: true, &block )
+  CsvReader.foreach( path, sep: sep, headers: headers, &block )
+end
+def self.header( path, sep: Csv.config.sep )   ## add header too? why? why not?
+  CsvReader.header( path, sep: sep )
+end
+end # class CsvHashReader
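
The effect of flipping the `headers:` default from `true` to `false` can be seen with the standard `csv` library that the reader wraps in this version. This is a small sketch that assumes only stdlib behavior; the method names `rows_as_arrays` and `rows_as_hashes` and the sample text are made up for illustration and do not appear in the gem.

```ruby
require 'csv'

SAMPLE = "A;B;C\n1;2;3\n4;5;6\n"

# headers: false (the new CsvReader default): an array of arrays of strings,
# with the header line included as the first record.
def rows_as_arrays( txt, sep: ',' )
  CSV.parse( txt, headers: false, col_sep: sep )
end

# headers: true (the CsvHashReader default): a CSV::Table of CSV::Row objects,
# with the header line consumed; each row is converted to a plain hash here.
def rows_as_hashes( txt, sep: ',' )
  CSV.parse( txt, headers: true, col_sep: sep ).map { |row| row.to_h }
end

arrays = rows_as_arrays( SAMPLE, sep: ';' )
hashes = rows_as_hashes( SAMPLE, sep: ';' )

puts arrays.size   # prints 3 (header row counted)
puts hashes.size   # prints 2 (header row not counted)
pp arrays.first    # => ["A", "B", "C"]
pp hashes.first.keys   # => ["A", "B", "C"]
```

This is the same difference the updated tests assert for `beer.csv`: 7 rows through `CsvReader.read`, 6 through `CsvHashReader.read`.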

data/lib/csvreader/version.rb CHANGED

@@ -4,7 +4,7 @@
 class CsvReader   ## note: uses a class for now - change to module - why? why not?
   MAJOR = 0    ## todo: namespace inside version or something - why? why not??
-  MINOR = 1
+  MINOR = 2
   PATCH = 0
   VERSION = [MAJOR,MINOR,PATCH].join('.')

data/test/data/shakespeare.csv ADDED

@@ -0,0 +1,9 @@
+Quote,Play,Cite
+Sweet are the uses of adversity,As You Like It,"Act 2, scene 1, 12"
+All the world's a stage,As You Like It,"Act 2, scene 7, 139"
+"We few, we happy few",Henry V,
+"""Seems,"" madam! Nay it is; I know not ""seems.""",Hamlet,(1.ii.76)
+"To be, or not to be",Hamlet,"Act 3, scene 1, 55"
+What's in a name? That which we call a rose by any other name would smell as sweet.,Romeo and Juliet,"(II, ii, 1-2)"
+"O Romeo, Romeo, wherefore art thou Romeo?",Romeo and Juliet,"Act 2, scene 2, 33"
+"Tomorrow, and tomorrow, and tomorrow",Macbeth,"Act 5, scene 5, 19"
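
The new fixture exercises quoted values with embedded commas and doubled-up double quotes. As a quick check with the standard `csv` library, the sketch below parses two of its lines. It assumes the lines are inlined as strings rather than read from `test/data/shakespeare.csv`, and `parse_fixture_line` is an illustrative name, not a csvreader method.

```ruby
require 'csv'

# Thin wrapper around the stdlib parser, for illustration only.
def parse_fixture_line( line )
  CSV.parse_line( line )
end

hamlet  = parse_fixture_line( '"""Seems,"" madam! Nay it is; I know not ""seems.""",Hamlet,(1.ii.76)' )
henry_v = parse_fixture_line( '"We few, we happy few",Henry V,' )

puts hamlet[0]     # prints: "Seems," madam! Nay it is; I know not "seems."
puts hamlet.size   # prints: 3
puts henry_v[0]    # prints: We few, we happy few
pp henry_v[2]      # prints: nil (stdlib returns nil for the empty trailing field)
```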

data/test/test_reader.rb CHANGED

@@ -9,39 +9,40 @@ require 'helper'
 class TestReader < MiniTest::Test
 def test_read
   puts "== read: beer.csv:"
-  table = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv" )   ## returns CSV::Table
+  data = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv" )
-  pp table.class.name
-  pp table
-  pp table.to_a   ## note: includes header (first row with column names)
+  pp data.class.name
+  pp data
-  table.each do |row|   ## note: will skip (NOT include) header row!!
+  data.each do |row|
     pp row
   end
-  puts "  #{table.size} rows"  ## note: again will skip (NOT include) header row in count!!!
-  assert_equal 6, table.size
+  puts "  #{data.size} rows"
+  assert_equal 7, data.size   ## note: include header row in count
 end
-def test_read_header_false
-  puts "== read (headers: false): beer.csv:"
-  data = CsvReader.read( "#{CsvReader.test_data_dir}/beer.csv", headers: false )
+def test_read_hash
+  puts "== read (hash): beer.csv:"
+  table = CsvHashReader.read( "#{CsvReader.test_data_dir}/beer.csv" )   ## returns CSV::Table
-  pp data.class.name
-  pp data
+  pp table.class.name
+  pp table
+  pp table.to_a   ## note: includes header (first row with column names)
-  data.each do |row|
+  table.each do |row|   ## note: will skip (NOT include) header row!!
     pp row
   end
-  puts "  #{data.size} rows"
-  assert_equal 7, data.size   ## note: include header row in count
+  puts "  #{table.size} rows"  ## note: again will skip (NOT include) header row in count!!!
+  assert_equal 6, table.size
 end
-def test_read11
-  puts "== read: beer11.csv:"
-  table = CsvReader.read( "#{CsvReader.test_data_dir}/beer11.csv" )
+def test_read_hash11
+  puts "== read (hash): beer11.csv:"
+  table = CsvHashReader.read( "#{CsvReader.test_data_dir}/beer11.csv" )
   pp table
   pp table.to_a   ## note: includes header (first row with column names)
@@ -90,28 +91,29 @@ def test_header11
 end
 def test_foreach
-  puts "== foreach: beer.csv:"
-  CsvReader.foreach( "#{CsvReader.test_data_dir}/beer.csv" ) do |row|
-    pp row
-    pp row.fields
+  puts "== foreach: beer11.csv:"
+  CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+    pp row      ## note: is Array (no .fields available!!!!!)
   end
   assert true
 end
-def test_foreach11
-  puts "== foreach: beer11.csv:"
-  CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+def test_foreach_hash
+  puts "== foreach (hash): beer.csv:"
+  CsvHashReader.foreach( "#{CsvReader.test_data_dir}/beer.csv" ) do |row|
     pp row
     pp row.fields
   end
   assert true
 end
-def test_foreach_header_false
-  puts "== foreach (headers: false): beer11.csv:"
-  CsvReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv", headers: false ) do |row|
-    pp row      ## note: is Array (no .fields available!!!!!)
+def test_foreach_hash11
+  puts "== foreach (hash): beer11.csv:"
+  CsvHashReader.foreach( "#{CsvReader.test_data_dir}/beer11.csv" ) do |row|
+    pp row
+    pp row.fields
   end
   assert true
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: csvreader
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Gerald Bauer
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-08-15 00:00:00.000000000 Z
+date: 2018-08-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rdoc
@@ -59,6 +59,7 @@ files:
 - lib/csvreader/version.rb
 - test/data/beer.csv
 - test/data/beer11.csv
+- test/data/shakespeare.csv
 - test/helper.rb
 - test/test_reader.rb
 homepage: https://github.com/csv11/csvreader