csvjson 0.0.1 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 56559ebdff3af27b8342110f2086c854a858ba99
4
- data.tar.gz: 82cb4535856f56c62be126da74655bddf54dabd1
3
+ metadata.gz: d79afe20fafe225a50aa9a270e6bee240b47566a
4
+ data.tar.gz: e62c8fc7c94431e8b4756535f93221b743dfac08
5
5
  SHA512:
6
- metadata.gz: 40516df39243045d71dd4a150bfed56c40c2039e7381bfae558ab125cf6207f7b4b3094ab9041d0d79ac58931d6d225b2aac9fae1d76094ffe7dd3d7076ad432
7
- data.tar.gz: 5a7cd3c63e571772634e6485b15e40aef4011fd30336a88e540111104a8a55b1d57b14b5167185eb5a432bc505451b522d4dfd23520924ae4025fb66d2245b93
6
+ metadata.gz: d642bd3599dd04e15abd6d4accc5f2c63106af5158ac4876d79f0583c689e33013b05adb762e668db757028fcaed435e248b5b25703445b4379d5e058b5f4fce
7
+ data.tar.gz: 671a42c6e0f4245142d691ad9c0b9b596eb6754c2d543a1c95b6d52055b274c14321e1992d311778907c7fd6a443440911560a6dd6b134d73e2bd666a859c36b
data/Manifest.txt CHANGED
@@ -3,4 +3,10 @@ Manifest.txt
3
3
  README.md
4
4
  Rakefile
5
5
  lib/csvjson.rb
6
+ lib/csvjson/parser.rb
6
7
  lib/csvjson/version.rb
8
+ test/data/hello.json.csv
9
+ test/data/hello11.json.csv
10
+ test/helper.rb
11
+ test/test_parser.rb
12
+ test/test_parser_misc.rb
data/README.md CHANGED
@@ -1,3 +1,155 @@
1
- # csvjson
1
+ # CSV <3 JSON Parser / Reader
2
2
 
3
3
  csvjson library / gem - read tabular data in the CSV <3 JSON format, that is, comma-separated values (CSV) line-by-line records with JavaScript Object Notation (JSON) encoding rules
4
+
5
+ * home :: [github.com/csv11/csvjson](https://github.com/csv11/csvjson)
6
+ * bugs :: [github.com/csv11/csvjson/issues](https://github.com/csv11/csvjson/issues)
7
+ * gem :: [rubygems.org/gems/csvjson](https://rubygems.org/gems/csvjson)
8
+ * rdoc :: [rubydoc.info/gems/csvjson](http://rubydoc.info/gems/csvjson)
9
+ * forum :: [wwwmake](http://groups.google.com/group/wwwmake)
10
+
11
+
12
+ ## What's CSV <3 JSON?
13
+
14
+ CSV <3 JSON is a Comma-Separated Values (CSV)
15
+ variant / format / dialect
16
+ where the line-by-line records follow the
17
+ JavaScript Object Notation (JSON) encoding rules.
18
+ It's a modern (simple) tabular data format that
19
+ includes arrays, numbers, booleans, nulls, nested structures, comments and more.
20
+ Example:
21
+
22
+
23
+ ```
24
+ # "Vanilla" CSV <3 JSON
25
+
26
+ 1,"John","12 Totem Rd. Aspen",true
27
+ 2,"Bob",null,false
28
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
29
+ ```
30
+
31
+ or
32
+
33
+ ```
34
+ # CSV <3 JSON with array values
35
+
36
+ 1,"directions",["north","south","east","west"]
37
+ 2,"colors",["red","green","blue"]
38
+ 3,"drinks",["soda","water","tea","coffee"]
39
+ 4,"spells",[]
40
+ ```
41
+
42
+ For more see the [official CSV <3 JSON Format documentation »](https://github.com/csv11/csv-json)
43
+
44
+
45
+
46
+ ## Usage
47
+
48
+ ``` ruby
49
+ txt = <<TXT
50
+ # "Vanilla" CSV <3 JSON
51
+
52
+ 1,"John","12 Totem Rd. Aspen",true
53
+ 2,"Bob",null,false
54
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
55
+ TXT
56
+
57
+ records = CsvJson.parse( txt ) ## or CSV_JSON.parse or CSVJSON.parse
58
+ pp records
59
+ # => [[1,"John","12 Totem Rd. Aspen",true],
60
+ # [2,"Bob",nil,false],
61
+ # [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]]
62
+
63
+ # -or-
64
+
65
+ records = CsvJson.read( "values.json.csv" ) ## or CSV_JSON.read or CSVJSON.read
66
+ pp records
67
+ # => [[1,"John","12 Totem Rd. Aspen",true],
68
+ # [2,"Bob",nil,false],
69
+ # [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]]
70
+
71
+ # -or-
72
+
73
+ CsvJson.foreach( "values.json.csv" ) do |rec| ## or CSV_JSON.foreach or CSVJSON.foreach
74
+ pp rec
75
+ end
76
+ # => [1,"John","12 Totem Rd. Aspen",true]
77
+ # => [2,"Bob",nil,false]
78
+ # => [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]
79
+ ```
80
+
81
+
82
+
83
+ ### What about Enumerable?
84
+
85
+ Yes, the reader / parser includes `Enumerable` and runs on `each`.
86
+ Use `new` or `open` without a block
87
+ and call `to_enum` on the result to get the enumerator (iterator).
88
+ Example:
89
+
90
+
91
+ ``` ruby
92
+ csv = CsvJson.new( "1,2,3" ) ## or CSV_JSON.new or CSVJSON.new
93
+ it = csv.to_enum
94
+ pp it.next
95
+ # => [1,2,3]
96
+
97
+ # -or-
98
+
99
+ csv = CsvJson.open( "values.json.csv" ) ## or CSV_JSON.open or CSVJSON.open
100
+ it = csv.to_enum
101
+ pp it.next
102
+ # => [1,"John","12 Totem Rd. Aspen",true]
103
+ pp it.next
104
+ # => [2,"Bob",nil,false]
105
+ ```
106
+
107
+
108
+
109
+ ### What about headers?
110
+
111
+ Yes, you can. Use the `CsvHash`
112
+ from the csvreader library / gem
113
+ if the first line is a header (or, if it is missing, pass in the headers
114
+ as an array) and you want your records as hashes instead of arrays of values.
115
+ Example:
116
+
117
+ ``` ruby
118
+ txt = <<TXT
119
+ "id","name","address","regular"
120
+ 1,"John","12 Totem Rd. Aspen",true
121
+ 2,"Bob",null,false
122
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
123
+ TXT
124
+
125
+ records = CsvHash.json.parse( txt )
126
+ pp records
127
+
128
+ # => [{"id": 1,
129
+ # "name": "John",
130
+ # "address": "12 Totem Rd. Aspen",
131
+ # "regular": true},
132
+ # {"id": 2,
133
+ # "name": "Bob",
134
+ # "address": nil,
135
+ # "regular": false},
136
+ # ... ]
137
+ ```
138
+
139
+ For more see the [official CsvHash documentation in the csvreader library / gem »](https://github.com/csv11/csvreader)
140
+
141
+
142
+
143
+
144
+
145
+
146
+ ## License
147
+
148
+ The `csvjson` scripts are dedicated to the public domain.
149
+ Use them as you please with no restrictions whatsoever.
150
+
151
+
152
+ ## Questions? Comments?
153
+
154
+ Send them along to the [wwwmake forum](http://groups.google.com/group/wwwmake).
155
+ Thanks!
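The "What about headers?" section of the README above delegates to `CsvHash` from the csvreader gem. As a rough illustration of the idea only, the sketch below uses nothing but the standard library: it treats the first record as the header row and zips it with each following record. The helper name `parse_with_headers` and the constant `SAMPLE_WITH_HEADERS` are hypothetical and are not part of csvjson or csvreader.

```ruby
require 'json'
require 'pp'

# Hypothetical helper (not part of csvjson or csvreader):
# parse CSV <3 JSON text and use the first record as the header row.
def parse_with_headers( txt )
  rows = txt.each_line
            .map    { |line| line.strip }
            .reject { |line| line.empty? || line.start_with?( '#' ) }
            .map    { |line| JSON.parse( "[#{line}]" ) }

  headers = rows.shift || []   # first record holds the column names
  rows.map { |values| headers.zip( values ).to_h }
end

SAMPLE_WITH_HEADERS = <<TXT
"id","name","address","regular"
1,"John","12 Totem Rd. Aspen",true
2,"Bob",null,false
TXT

# Each record is now a Hash keyed by the header names;
# JSON null arrives as Ruby nil (see Bob's address).
pp parse_with_headers( SAMPLE_WITH_HEADERS )
```

For real use, the `CsvHash` route described in the README is presumably the better choice, since it also covers the case where the headers are passed in separately.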
data/lib/csvjson.rb CHANGED
@@ -1,11 +1,22 @@
1
1
  # encoding: utf-8
2
2
 
3
3
  require 'pp'
4
+ require 'json'
5
+ require 'logger'
4
6
 
5
7
 
6
8
  ## our own code
9
+ ## todo/check: use require_relative - why? why not?
7
10
  require 'csvjson/version' # note: let version always go first
11
+ require 'csvjson/parser'
8
12
 
9
13
 
10
- pp CsvJson.banner
11
- pp CsvJson.root
14
+ ## add some "alternative" shortcut aliases
15
+ CSV_JSON = CsvJson
16
+ CSVJSON = CsvJson
17
+ CSVJ = CsvJson
18
+ CsvJ = CsvJson
19
+
20
+
21
+
22
+ puts CsvJson.banner
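The "alternative" shortcut aliases in the hunk above are plain constant assignments, so every alias refers to the very same class object. A minimal sketch of that behaviour, using a hypothetical stand-in class `CsvJsonDemo` rather than the gem's own class:

```ruby
# Hypothetical stand-in for the real class, just for this sketch.
class CsvJsonDemo
  def self.parse( txt )
    "parsed: #{txt}"
  end
end

# Assigning the class to another constant creates an alias, not a copy.
CSV_JSON_DEMO = CsvJsonDemo

puts CSV_JSON_DEMO.equal?( CsvJsonDemo )   # same object
puts CSV_JSON_DEMO.name                    # still reports the original class name
puts CSV_JSON_DEMO.parse( "1,2,3" )
```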
data/lib/csvjson/parser.rb ADDED
@@ -0,0 +1,131 @@
1
+ # encoding: utf-8
2
+
3
+
4
+ class CsvJson
5
+
6
+ ###################################
7
+ ## add simple logger with debug flag/switch
8
+ #
9
+ # use CsvJson.logger.level = :debug # to turn on
10
+ #
11
+ # todo/fix: use logutils instead of std logger - why? why not?
12
+
13
+ def self.build_logger()
14
+ l = Logger.new( STDOUT )
15
+ l.level = :info ## set to :info on start; note: is 0 (debug) by default
16
+ l
17
+ end
18
+ def self.logger() @@logger ||= build_logger; end
19
+ def logger() self.class.logger; end
20
+
21
+
22
+
23
+
24
+ def self.open( path, mode=nil, &block ) ## rename path to filename or name - why? why not?
25
+
26
+ ## note: default mode (if nil/not passed in) to 'r:bom|utf-8'
27
+ f = File.open( path, mode ? mode : 'r:bom|utf-8' )
28
+ csv = new( f )
29
+
30
+ # handle blocks like Ruby's open()
31
+ if block_given?
32
+ begin
33
+ block.call( csv )
34
+ ensure
35
+ csv.close
36
+ end
37
+ else
38
+ csv
39
+ end
40
+ end # method self.open
41
+
42
+
43
+ def self.read( path )
44
+ open( path ) { |csv| csv.read }
45
+ end
46
+
47
+
48
+ def self.foreach( path, &block )
49
+ csv = open( path )
50
+
51
+ if block_given?
52
+ begin
53
+ csv.each( &block )
54
+ ensure
55
+ csv.close
56
+ end
57
+ else
58
+ csv.to_enum ## note: caller (responsible) must close file!!!
59
+ ## remove version without block given - why? why not?
60
+ ## use Csv.open().to_enum or Csv.open().each
61
+ ## or Csv.new( File.new() ).to_enum or Csv.new( File.new() ).each ???
62
+ end
63
+ end # method self.foreach
64
+
65
+
66
+ def self.parse( data, &block )
67
+ csv = new( data )
68
+
69
+ if block_given?
70
+ csv.each( &block ) ## note: caller (responsible) must close file!!! - add autoclose - why? why not?
71
+ else # slurp contents, if no block is given
72
+ csv.read ## note: caller (responsible) must close file!!! - add autoclose - why? why not?
73
+ end
74
+ end # method self.parse
75
+
76
+
77
+
78
+ def initialize( data )
79
+ if data.is_a?( String )
80
+ @input = data # note: just needs each for each_line
81
+ else ## assume io
82
+ @input = data
83
+ end
84
+ end
85
+
86
+
87
+
88
+ include Enumerable
89
+
90
+ def each( &block )
91
+ if block_given?
92
+ @input.each_line do |line|
93
+
94
+ logger.debug "line:" if logger.debug?
95
+ logger.debug line.pretty_inspect if logger.debug?
96
+
97
+
98
+ ## note: if the argument to chomp is an empty string (''),
99
+ ## it will remove all trailing newlines from the string.
100
+ ## use line.sub(/[\n\r]*$/, '') or similar instead - why? why not?
101
+ line = line.chomp( '' )
102
+ line = line.strip ## strip leading and trailing whitespaces (space/tab) too
103
+ logger.debug line.pretty_inspect if logger.debug?
104
+
105
+ if line.empty? ## skip blank lines
106
+ logger.debug "skip blank line" if logger.debug?
107
+ next
108
+ end
109
+
110
+ if line.start_with?( "#" ) ## skip comment lines
111
+ logger.debug "skip comment line" if logger.debug?
112
+ next
113
+ end
114
+
115
+ ## note: auto-wrap in array e.g. with []
116
+ json = JSON.parse( "[#{line}]" )
117
+ logger.debug json.pretty_inspect if logger.debug?
118
+ block.call( json )
119
+ end
120
+ else
121
+ to_enum
122
+ end
123
+ end # method each
124
+
125
+ def read() to_a; end # method read
126
+
127
+ def close
128
+ @input.close if @input.respond_to?(:close) ## note: string needs no close
129
+ end
130
+
131
+ end # class CsvJson
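The core of the `each` method in the hunk above is small: strip each line, skip blank and comment lines, then wrap what is left in `[` and `]` so that the whole line parses as one JSON array. The following standalone sketch restates just that technique outside the class; the method name `parse_lines` is made up for illustration.

```ruby
require 'json'
require 'pp'

# Standalone sketch of the per-line technique (not the gem's class).
def parse_lines( txt )
  records = []
  txt.each_line do |line|
    line = line.strip
    next if line.empty?                       # skip blank lines
    next if line.start_with?( '#' )           # skip comment lines
    records << JSON.parse( "[#{line}]" )      # wrap in [] => one JSON array per line
  end
  records
end

pp parse_lines( <<TXT )
# a comment, then a blank line

1,"John",null,true
 2, ["a","b"], {"c": 1}
TXT
```

Because every line is parsed on its own, a record has to fit on a single line; a JSON array or object that is pretty-printed across several lines would fail to parse with this approach.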
data/lib/csvjson/version.rb CHANGED
@@ -4,9 +4,9 @@
4
4
 
5
5
  class CsvJson
6
6
 
7
- MAJOR = 0
7
+ MAJOR = 1
8
8
  MINOR = 0
9
- PATCH = 1
9
+ PATCH = 0
10
10
  VERSION = [MAJOR,MINOR,PATCH].join('.')
11
11
 
12
12
  def self.version
data/test/data/hello.json.csv ADDED
@@ -0,0 +1,3 @@
1
+ 1,"John","12 Totem Rd. Aspen",true
2
+ 2,"Bob",null,false
3
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
data/test/data/hello11.json.csv ADDED
@@ -0,0 +1,5 @@
1
+ # hello world
2
+
3
+ 1, "John", "12 Totem Rd. Aspen", true
4
+ 2, "Bob", null, false
5
+ 3, "Sue", "Bigsby, 345 Carnival, WA 23009", false
data/test/helper.rb ADDED
@@ -0,0 +1,20 @@
1
+ ## $:.unshift(File.dirname(__FILE__))
2
+
3
+ ## minitest setup
4
+
5
+ require 'minitest/autorun'
6
+
7
+
8
+ ## our own code
9
+ require 'csvjson'
10
+
11
+
12
+ ## add test_data_dir helper
13
+ class CsvJson
14
+ def self.test_data_dir
15
+ "#{root}/test/data"
16
+ end
17
+ end
18
+
19
+
20
+ CsvJson.logger.level = :debug ## turn on "global" logging
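The last line of the helper above switches the shared logger to `:debug`, which is what makes the `logger.debug ... if logger.debug?` guards in the parser produce output during the tests. A small sketch of that switch with the standard `Logger`, writing to a `StringIO` instead of `STDOUT` so the effect can be inspected; the method name `demo_logging` is hypothetical.

```ruby
require 'logger'
require 'stringio'

# Hypothetical demo: show how the level decides whether debug lines are written.
def demo_logging
  out = StringIO.new
  log = Logger.new( out )

  log.level = :info                                   # same start level as build_logger
  log.debug "hidden at :info" if log.debug?           # guard is false, nothing written

  log.level = :debug                                  # what test/helper.rb does
  log.debug "visible at :debug" if log.debug?         # guard is true, line is written

  out.string
end

puts demo_logging
```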
data/test/test_parser.rb ADDED
@@ -0,0 +1,104 @@
1
+ # encoding: utf-8
2
+
3
+ ###
4
+ # to run use
5
+ # ruby -I ./lib -I ./test test/test_parser.rb
6
+
7
+
8
+ require 'helper'
9
+
10
+ class TestParser < MiniTest::Test
11
+
12
+
13
+ def parser
14
+ CsvJson
15
+ end
16
+
17
+ def records ## "standard" records for testing
18
+ [[1, "John", "12 Totem Rd. Aspen", true],
19
+ [2, "Bob", nil, false],
20
+ [3, "Sue", "Bigsby, 345 Carnival, WA 23009", false]]
21
+ end
22
+
23
+
24
+
25
+ def test_parse
26
+ assert_equal records, parser.parse( <<TXT )
27
+ 1,"John","12 Totem Rd. Aspen",true
28
+ 2,"Bob",null,false
29
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
30
+ TXT
31
+
32
+ assert_equal records, parser.parse( <<TXT )
33
+ # hello world
34
+
35
+ 1,"John","12 Totem Rd. Aspen",true
36
+ 2,"Bob",null,false
37
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
38
+ TXT
39
+
40
+ assert_equal records, parser.parse( <<TXT )
41
+ # hello world (pretty printed)
42
+
43
+ 1, "John", "12 Totem Rd. Aspen", true
44
+ 2, "Bob", null, false
45
+ 3, "Sue", "Bigsby, 345 Carnival, WA 23009", false
46
+
47
+ # try more comments and empty lines
48
+
49
+ TXT
50
+
51
+
52
+ txt =<<TXT
53
+ # hello world
54
+
55
+ 1,"John","12 Totem Rd. Aspen",true
56
+ 2,"Bob",null,false
57
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
58
+ TXT
59
+
60
+ recs = []
61
+ parser.parse( txt ) { |rec| recs << rec }
62
+ assert_equal records, recs
63
+ end
64
+
65
+
66
+ def test_read
67
+ assert_equal records, parser.read( "#{CsvJson.test_data_dir}/hello.json.csv" )
68
+ assert_equal records, parser.read( "#{CsvJson.test_data_dir}/hello11.json.csv" )
69
+ end
70
+
71
+
72
+ def test_open
73
+ assert_equal records, parser.open( "#{CsvJson.test_data_dir}/hello.json.csv", "r:bom|utf-8" ).read
74
+ assert_equal records, parser.open( "#{CsvJson.test_data_dir}/hello11.json.csv", "r:bom|utf-8" ).read
75
+ end
76
+
77
+
78
+ def test_foreach
79
+ recs = []
80
+ parser.foreach( "#{CsvJson.test_data_dir}/hello.json.csv" ) { |rec| recs << rec }
81
+ assert_equal records, recs
82
+
83
+ recs = []
84
+ parser.foreach( "#{CsvJson.test_data_dir}/hello11.json.csv" ) { |rec| recs << rec }
85
+ assert_equal records, recs
86
+ end
87
+
88
+
89
+ def test_enum
90
+ csv = CsvJson.new( <<TXT )
91
+ # hello world
92
+
93
+ 1,"John","12 Totem Rd. Aspen",true
94
+ 2,"Bob",null,false
95
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
96
+ TXT
97
+
98
+ it = csv.to_enum
99
+ assert_equal [1, "John", "12 Totem Rd. Aspen", true], it.next
100
+ assert_equal [2, "Bob", nil, false], it.next
101
+ assert_equal [3, "Sue", "Bigsby, 345 Carnival, WA 23009", false], it.next
102
+ end
103
+
104
+ end # class TestParser
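`test_enum` above works because the class defines `each` and includes `Enumerable`, so `to_enum` can hand out an external iterator. A minimal sketch of that pattern with a hypothetical class `LineRecords` (no blank-line or comment handling, unlike the real parser):

```ruby
require 'json'
require 'pp'

# Hypothetical minimal reader: one JSON-encoded record per line.
class LineRecords
  include Enumerable

  def initialize( txt )
    @txt = txt
  end

  # With a block: yield each record. Without: return an enumerator over each.
  def each( &block )
    return to_enum unless block_given?
    @txt.each_line { |line| block.call( JSON.parse( "[#{line.strip}]" ) ) }
  end
end

it = LineRecords.new( "1,2,3\n4,5,6\n" ).to_enum
pp it.next    # first record
pp it.next    # second record; a further it.next would raise StopIteration

# Enumerable methods come for free once each is defined.
pp LineRecords.new( "1,2,3\n4,5,6\n" ).first
```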
data/test/test_parser_misc.rb ADDED
@@ -0,0 +1,80 @@
1
+ # encoding: utf-8
2
+
3
+ ###
4
+ # to run use
5
+ # ruby -I ./lib -I ./test test/test_parser_misc.rb
6
+
7
+
8
+ require 'helper'
9
+
10
+ class TestParserMisc < MiniTest::Test
11
+
12
+
13
+ def parser
14
+ CsvJson
15
+ end
16
+
17
+
18
+ def test_quotes_and_commas
19
+ assert_equal [
20
+ [1, "John", "12 Totem Rd., Aspen", true],
21
+ [2, "Bob", nil, false],
22
+ [3, "Sue", "\"Bigsby\", 345 Carnival, WA 23009", false]
23
+ ], parser.parse( <<TXT )
24
+ 1,"John","12 Totem Rd., Aspen",true
25
+ 2,"Bob",null,false
26
+ 3,"Sue","\\"Bigsby\\", 345 Carnival, WA 23009",false
27
+ TXT
28
+ end
29
+
30
+
31
+ def test_arrays
32
+ assert_equal [
33
+ [1, "directions", ["north","south","east","west"]],
34
+ [2, "colors", ["red","green","blue"]],
35
+ [3, "drinks", ["soda","water","tea","coffe"]],
36
+ [4, "spells", []],
37
+ ], parser.parse( <<TXT )
38
+ # CSV <3 JSON with array values
39
+
40
+ 1,"directions",["north","south","east","west"]
41
+ 2,"colors",["red","green","blue"]
42
+ 3,"drinks",["soda","water","tea","coffe"]
43
+ 4,"spells",[]
44
+ TXT
45
+ end
46
+
47
+ def test_misc
48
+ ## note:
49
+ ## in the csv <3 json source text backslash needs to get doubled / escaped twice e.g.
50
+ ## \\" for quotes
51
+ ## \\n for newlines and so on
52
+
53
+ assert_equal [
54
+ ["index", "value1", "value2"],
55
+ ["number", 1, 2],
56
+ ["boolean", false, true],
57
+ ["null", nil, "non null"],
58
+ ["array of numbers", [1], [1,2]],
59
+ ["simple object", {"a" => 1}, {"a" => 1, "b" => 2}],
60
+ ["array with mixed objects", [1, nil,"ball"], [2,{"a" => 10, "b" => 20},"cube"]],
61
+ ["string with quotes", "a\"b", "alert(\"Hi!\")"],
62
+ ["string with bell&newlines","bell is \u0007","multi\nline\ntext"]
63
+ ], parser.parse( <<TXT )
64
+ # CSV with all kinds of values
65
+
66
+ "index","value1","value2"
67
+ "number",1,2
68
+ "boolean",false,true
69
+ "null",null,"non null"
70
+ "array of numbers",[1],[1,2]
71
+ "simple object",{"a": 1},{"a":1, "b":2}
72
+ "array with mixed objects",[1,null,"ball"],[2,{"a": 10, "b": 20},"cube"]
73
+ "string with quotes","a\\"b","alert(\\"Hi!\\")"
74
+ "string with bell&newlines","bell is \\u0007","multi\\nline\\ntext"
75
+ TXT
76
+
77
+ end
78
+
79
+
80
+ end # class TestParserMisc
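The note in `test_misc` about doubling the backslash follows from how Ruby heredocs work: a plain `<<TXT` heredoc processes escapes like a double-quoted string, so `\\"` in the Ruby source becomes `\"` in the text that reaches the JSON parser. A single-quoted heredoc (`<<'TXT'`) leaves the text raw, so no doubling is needed. The sketch below compares the two; the constant names are made up for illustration.

```ruby
require 'json'
require 'pp'

# Plain heredoc: escapes are processed, so the backslash is written twice.
DOUBLED = <<TXT
"a\\"b","multi\\nline"
TXT

# Single-quoted heredoc: text is taken literally, one backslash is enough.
RAW = <<'TXT'
"a\"b","multi\nline"
TXT

puts DOUBLED == RAW                      # both heredocs hold the same characters
pp JSON.parse( "[#{RAW.strip}]" )        # JSON decodes \" and \n inside the strings
```

In a `.json.csv` file on disk no Ruby escaping is involved, so the single-backslash form is what the file itself would contain.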
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csvjson
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 1.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Gerald Bauer
@@ -54,7 +54,13 @@ files:
54
54
  - README.md
55
55
  - Rakefile
56
56
  - lib/csvjson.rb
57
+ - lib/csvjson/parser.rb
57
58
  - lib/csvjson/version.rb
59
+ - test/data/hello.json.csv
60
+ - test/data/hello11.json.csv
61
+ - test/helper.rb
62
+ - test/test_parser.rb
63
+ - test/test_parser_misc.rb
58
64
  homepage: https://github.com/csv11/csvjson
59
65
  licenses:
60
66
  - Public Domain