csvjson 0.0.1 → 1.0.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 56559ebdff3af27b8342110f2086c854a858ba99
4
- data.tar.gz: 82cb4535856f56c62be126da74655bddf54dabd1
3
+ metadata.gz: d79afe20fafe225a50aa9a270e6bee240b47566a
4
+ data.tar.gz: e62c8fc7c94431e8b4756535f93221b743dfac08
5
5
  SHA512:
6
- metadata.gz: 40516df39243045d71dd4a150bfed56c40c2039e7381bfae558ab125cf6207f7b4b3094ab9041d0d79ac58931d6d225b2aac9fae1d76094ffe7dd3d7076ad432
7
- data.tar.gz: 5a7cd3c63e571772634e6485b15e40aef4011fd30336a88e540111104a8a55b1d57b14b5167185eb5a432bc505451b522d4dfd23520924ae4025fb66d2245b93
6
+ metadata.gz: d642bd3599dd04e15abd6d4accc5f2c63106af5158ac4876d79f0583c689e33013b05adb762e668db757028fcaed435e248b5b25703445b4379d5e058b5f4fce
7
+ data.tar.gz: 671a42c6e0f4245142d691ad9c0b9b596eb6754c2d543a1c95b6d52055b274c14321e1992d311778907c7fd6a443440911560a6dd6b134d73e2bd666a859c36b
data/Manifest.txt CHANGED
@@ -3,4 +3,10 @@ Manifest.txt
3
3
  README.md
4
4
  Rakefile
5
5
  lib/csvjson.rb
6
+ lib/csvjson/parser.rb
6
7
  lib/csvjson/version.rb
8
+ test/data/hello.json.csv
9
+ test/data/hello11.json.csv
10
+ test/helper.rb
11
+ test/test_parser.rb
12
+ test/test_parser_misc.rb
data/README.md CHANGED
@@ -1,3 +1,155 @@
1
- # csvjson
1
+ # CSV <3 JSON Parser / Reader
2
2
 
3
3
  csvjson library / gem - read tabular data in the CSV <3 JSON format, that is, comma-separated values CSV (line-by-line) records with javascript object notation (JSON) encoding rules
4
+
5
+ * home :: [github.com/csv11/csvjson](https://github.com/csv11/csvjson)
6
+ * bugs :: [github.com/csv11/csvjson/issues](https://github.com/csv11/csvjson/issues)
7
+ * gem :: [rubygems.org/gems/csvjson](https://rubygems.org/gems/csvjson)
8
+ * rdoc :: [rubydoc.info/gems/csvjson](http://rubydoc.info/gems/csvjson)
9
+ * forum :: [wwwmake](http://groups.google.com/group/wwwmake)
10
+
11
+
12
+ ## What's CSV <3 JSON?
13
+
14
+ CSV <3 JSON is a Comma-Separated Values (CSV)
15
+ variant / format / dialect
16
+ where the line-by-line records follow the
17
+ JavaScript Object Notation (JSON) encoding rules.
18
+ It's a modern (simple) tabular data format that
19
+ includes arrays, numbers, booleans, nulls, nested structures, comments and more.
20
+ Example:
21
+
22
+
23
+ ```
24
+ # "Vanilla" CSV <3 JSON
25
+
26
+ 1,"John","12 Totem Rd. Aspen",true
27
+ 2,"Bob",null,false
28
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
29
+ ```
30
+
31
+ or
32
+
33
+ ```
34
+ # CSV <3 JSON with array values
35
+
36
+ 1,"directions",["north","south","east","west"]
37
+ 2,"colors",["red","green","blue"]
38
+ 3,"drinks",["soda","water","tea","coffee"]
39
+ 4,"spells",[]
40
+ ```
41
+
42
+ For more see the [official CSV <3 JSON Format documentation »](https://github.com/csv11/csv-json)
43
+
44
+
45
+
46
+ ## Usage
47
+
48
+ ``` ruby
49
+ txt = <<TXT
50
+ # "Vanilla" CSV <3 JSON
51
+
52
+ 1,"John","12 Totem Rd. Aspen",true
53
+ 2,"Bob",null,false
54
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
55
+ TXT
56
+
57
+ records = CsvJson.parse( txt ) ## or CSV_JSON.parse or CSVJSON.parse
58
+ pp records
59
+ # => [[1,"John","12 Totem Rd. Aspen",true],
60
+ # [2,"Bob",nil,false],
61
+ # [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]]
62
+
63
+ # -or-
64
+
65
+ records = CsvJson.read( "values.json.csv" ) ## or CSV_JSON.read or CSVJSON.read
66
+ pp records
67
+ # => [[1,"John","12 Totem Rd. Aspen",true],
68
+ # [2,"Bob",nil,false],
69
+ # [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]]
70
+
71
+ # -or-
72
+
73
+ CsvJson.foreach( "values.json.csv" ) do |rec| ## or CSV_JSON.foreach or CSVJSON.foreach
74
+ pp rec
75
+ end
76
+ # => [1,"John","12 Totem Rd. Aspen",true]
77
+ # => [2,"Bob",nil,false]
78
+ # => [3,"Sue","Bigsby, 345 Carnival, WA 23009",false]
79
+ ```
80
+
81
+
82
+
83
+ ### What about Enumerable?
84
+
85
+ Yes, the reader / parser includes `Enumerable` and runs on `each`.
86
+ Use `new` or `open` without a block to get a reader,
87
+ then call `to_enum` on it to get the enumerator (iterator).
88
+ Example:
89
+
90
+
91
+ ``` ruby
92
+ csv = CsvJson.new( "1,2,3" ) ## or CSV_JSON.new or CSVJSON.new
93
+ it = csv.to_enum
94
+ pp it.next
95
+ # => [1,2,3]
96
+
97
+ # -or-
98
+
99
+ csv = CsvJson.open( "values.json.csv" ) ## or CSV_JSON.open or CSVJSON.open
100
+ it = csv.to_enum
101
+ pp it.next
102
+ # => [1,"John","12 Totem Rd. Aspen",true]
103
+ pp it.next
104
+ # => [2,"Bob",nil,false]
105
+ ```
106
+
107
+
108
+
109
+ ### What about headers?
110
+
111
+ Yes, you can. Use `CsvHash`
112
+ from the csvreader library / gem
113
+ if the first line is a header (or, if the header is missing, pass in the headers
114
+ as an array) and you want your records as hashes instead of arrays of values.
115
+ Example:
116
+
117
+ ``` ruby
118
+ txt = <<TXT
119
+ "id","name","address","regular"
120
+ 1,"John","12 Totem Rd. Aspen",true
121
+ 2,"Bob",null,false
122
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
123
+ TXT
124
+
125
+ records = CsvHash.json.parse( txt )
126
+ pp records
127
+
128
+ # => [{"id": 1,
129
+ # "name": "John",
130
+ # "address": "12 Totem Rd. Aspen",
131
+ # "regular": true},
132
+ # {"id": 2,
133
+ # "name": "Bob",
134
+ # "address": nil,
135
+ # "regular": false},
136
+ # ... ]
137
+ ```
138
+
139
+ For more see the [official CsvHash documentation in the csvreader library / gem »](https://github.com/csv11/csvreader)
140
+
141
+
142
+
143
+
144
+
145
+
146
+ ## License
147
+
148
+ The `csvjson` scripts are dedicated to the public domain.
149
+ Use them as you please with no restrictions whatsoever.
150
+
151
+
152
+ ## Questions? Comments?
153
+
154
+ Send them along to the [wwwmake forum](http://groups.google.com/group/wwwmake).
155
+ Thanks!
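The README's header section relies on `CsvHash` from the csvreader gem. As a rough, stdlib-only illustration of the idea (not how `CsvHash` is implemented, which this diff does not show), the sketch below treats the first record as the header and zips it with every following record. The variable names are made up for the example.

```ruby
require 'json'

txt = <<TXT
"id","name","address","regular"
1,"John","12 Totem Rd. Aspen",true
2,"Bob",null,false
TXT

# Parse every line as a JSON array by wrapping it in brackets
# (the same trick the parser in this release uses).
rows = txt.each_line.map { |line| JSON.parse( "[#{line.strip}]" ) }

# Assumption: the first record is the header row.
headers, *records = rows

# Pair each header with the value in the same position.
hashes = records.map { |rec| headers.zip( rec ).to_h }
```

Because the values are decoded by JSON first, the hashes keep their types: `id` is an Integer, `regular` is a boolean, and a JSON `null` becomes `nil`.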
data/lib/csvjson.rb CHANGED
@@ -1,11 +1,22 @@
1
1
  # encoding: utf-8
2
2
 
3
3
  require 'pp'
4
+ require 'json'
5
+ require 'logger'
4
6
 
5
7
 
6
8
  ## our own code
9
+ ## todo/check: use require_relative - why? why not?
7
10
  require 'csvjson/version' # note: let version always go first
11
+ require 'csvjson/parser'
8
12
 
9
13
 
10
- pp CsvJson.banner
11
- pp CsvJson.root
14
+ ## add some "alternative" shortcut aliases
15
+ CSV_JSON = CsvJson
16
+ CSVJSON = CsvJson
17
+ CSVJ = CsvJson
18
+ CsvJ = CsvJson
19
+
20
+
21
+
22
+ puts CsvJson.banner
data/lib/csvjson/parser.rb ADDED
@@ -0,0 +1,131 @@
1
+ # encoding: utf-8
2
+
3
+
4
+ class CsvJson
5
+
6
+ ###################################
7
+ ## add simple logger with debug flag/switch
8
+ #
9
+ # use CsvJson.logger.level = :debug # to turn on
10
+ #
11
+ # todo/fix: use logutils instead of std logger - why? why not?
12
+
13
+ def self.build_logger()
14
+ l = Logger.new( STDOUT )
15
+ l.level = :info ## set to :info on start; note: is 0 (debug) by default
16
+ l
17
+ end
18
+ def self.logger() @@logger ||= build_logger; end
19
+ def logger() self.class.logger; end
20
+
21
+
22
+
23
+
24
+ def self.open( path, mode=nil, &block ) ## rename path to filename or name - why? why not?
25
+
26
+ ## note: default mode (if nil/not passed in) to 'r:bom|utf-8'
27
+ f = File.open( path, mode ? mode : 'r:bom|utf-8' )
28
+ csv = new( f )
29
+
30
+ # handle blocks like Ruby's open()
31
+ if block_given?
32
+ begin
33
+ block.call( csv )
34
+ ensure
35
+ csv.close
36
+ end
37
+ else
38
+ csv
39
+ end
40
+ end # method self.open
41
+
42
+
43
+ def self.read( path )
44
+ open( path ) { |csv| csv.read }
45
+ end
46
+
47
+
48
+ def self.foreach( path, &block )
49
+ csv = open( path )
50
+
51
+ if block_given?
52
+ begin
53
+ csv.each( &block )
54
+ ensure
55
+ csv.close
56
+ end
57
+ else
58
+ csv.to_enum ## note: caller (responsible) must close file!!!
59
+ ## remove version without block given - why? why not?
60
+ ## use Csv.open().to_enum or Csv.open().each
61
+ ## or Csv.new( File.new() ).to_enum or Csv.new( File.new() ).each ???
62
+ end
63
+ end # method self.foreach
64
+
65
+
66
+ def self.parse( data, &block )
67
+ csv = new( data )
68
+
69
+ if block_given?
70
+ csv.each( &block ) ## note: caller (responsible) must close file!!! - add autoclose - why? why not?
71
+ else # slurp contents, if no block is given
72
+ csv.read ## note: caller (responsible) must close file!!! - add autoclose - why? why not?
73
+ end
74
+ end # method self.parse
75
+
76
+
77
+
78
+ def initialize( data )
79
+ if data.is_a?( String )
80
+ @input = data # note: just needs each for each_line
81
+ else ## assume io
82
+ @input = data
83
+ end
84
+ end
85
+
86
+
87
+
88
+ include Enumerable
89
+
90
+ def each( &block )
91
+ if block_given?
92
+ @input.each_line do |line|
93
+
94
+ logger.debug "line:" if logger.debug?
95
+ logger.debug line.pretty_inspect if logger.debug?
96
+
97
+
98
+ ## note: chomp('') if is an empty string,
99
+ ## it will remove all trailing newlines from the string.
100
+ ## use line.sub(/[\n\r]*$/, '') or similar instead - why? why not?
101
+ line = line.chomp( '' )
102
+ line = line.strip ## strip leading and trailing whitespaces (space/tab) too
103
+ logger.debug line.pretty_inspect if logger.debug?
104
+
105
+ if line.empty? ## skip blank lines
106
+ logger.debug "skip blank line" if logger.debug?
107
+ next
108
+ end
109
+
110
+ if line.start_with?( "#" ) ## skip comment lines
111
+ logger.debug "skip comment line" if logger.debug?
112
+ next
113
+ end
114
+
115
+ ## note: auto-wrap in array e.g. with []
116
+ json = JSON.parse( "[#{line}]" )
117
+ logger.debug json.pretty_inspect if logger.debug?
118
+ block.call( json )
119
+ end
120
+ else
121
+ to_enum
122
+ end
123
+ end # method each
124
+
125
+ def read() to_a; end # method read
126
+
127
+ def close
128
+ @input.close if @input.respond_to?(:close) ## note: string needs no close
129
+ end
130
+
131
+ end # class CsvJson
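The core of the `each` method above can be isolated into a few lines. The following is a minimal sketch of that loop without the logging, assuming the same rules as the source (strip the line, skip blank lines and lines starting with `#`, wrap the rest in `[]` and hand it to `JSON.parse`). The module name `RecordSketch` and the method `each_record` are hypothetical and not part of the gem.

```ruby
require 'json'

# Hypothetical stand-in for the record loop in CsvJson#each.
module RecordSketch
  def self.each_record( text )
    return to_enum( :each_record, text ) unless block_given?

    text.each_line do |line|
      line = line.strip                 # drop newline and surrounding spaces/tabs
      next if line.empty?               # skip blank lines
      next if line.start_with?( '#' )   # skip comment lines
      yield JSON.parse( "[#{line}]" )   # one line => one JSON array
    end
  end
end

txt = <<TXT
# hello world

1, "John", "issue #12", true
  2,"Bob",null,false
TXT

records = RecordSketch.each_record( txt ).to_a
```

Two consequences of this approach are worth noting: a `#` only starts a comment at the beginning of a (stripped) line, so it is safe inside a quoted string, and because input is consumed line by line, a value cannot span lines; newlines inside strings have to be written as the JSON escape `\n`.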
data/lib/csvjson/version.rb CHANGED
@@ -4,9 +4,9 @@
4
4
 
5
5
  class CsvJson
6
6
 
7
- MAJOR = 0
7
+ MAJOR = 1
8
8
  MINOR = 0
9
- PATCH = 1
9
+ PATCH = 0
10
10
  VERSION = [MAJOR,MINOR,PATCH].join('.')
11
11
 
12
12
  def self.version
data/test/data/hello.json.csv ADDED
@@ -0,0 +1,3 @@
1
+ 1,"John","12 Totem Rd. Aspen",true
2
+ 2,"Bob",null,false
3
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
data/test/data/hello11.json.csv ADDED
@@ -0,0 +1,5 @@
1
+ # hello world
2
+
3
+ 1, "John", "12 Totem Rd. Aspen", true
4
+ 2, "Bob", null, false
5
+ 3, "Sue", "Bigsby, 345 Carnival, WA 23009", false
data/test/helper.rb ADDED
@@ -0,0 +1,20 @@
1
+ ## $:.unshift(File.dirname(__FILE__))
2
+
3
+ ## minitest setup
4
+
5
+ require 'minitest/autorun'
6
+
7
+
8
+ ## our own code
9
+ require 'csvjson'
10
+
11
+
12
+ ## add test_data_dir helper
13
+ class CsvJson
14
+ def self.test_data_dir
15
+ "#{root}/test/data"
16
+ end
17
+ end
18
+
19
+
20
+ CsvJson.logger.level = :debug ## turn on "global" logging
data/test/test_parser.rb ADDED
@@ -0,0 +1,104 @@
1
+ # encoding: utf-8
2
+
3
+ ###
4
+ # to run use
5
+ # ruby -I ./lib -I ./test test/test_parser.rb
6
+
7
+
8
+ require 'helper'
9
+
10
+ class TestParser < Minitest::Test
11
+
12
+
13
+ def parser
14
+ CsvJson
15
+ end
16
+
17
+ def records ## "standard" records for testing
18
+ [[1, "John", "12 Totem Rd. Aspen", true],
19
+ [2, "Bob", nil, false],
20
+ [3, "Sue", "Bigsby, 345 Carnival, WA 23009", false]]
21
+ end
22
+
23
+
24
+
25
+ def test_parse
26
+ assert_equal records, parser.parse( <<TXT )
27
+ 1,"John","12 Totem Rd. Aspen",true
28
+ 2,"Bob",null,false
29
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
30
+ TXT
31
+
32
+ assert_equal records, parser.parse( <<TXT )
33
+ # hello world
34
+
35
+ 1,"John","12 Totem Rd. Aspen",true
36
+ 2,"Bob",null,false
37
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
38
+ TXT
39
+
40
+ assert_equal records, parser.parse( <<TXT )
41
+ # hello world (pretty printed)
42
+
43
+ 1, "John", "12 Totem Rd. Aspen", true
44
+ 2, "Bob", null, false
45
+ 3, "Sue", "Bigsby, 345 Carnival, WA 23009", false
46
+
47
+ # try more comments and empty lines
48
+
49
+ TXT
50
+
51
+
52
+ txt =<<TXT
53
+ # hello world
54
+
55
+ 1,"John","12 Totem Rd. Aspen",true
56
+ 2,"Bob",null,false
57
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
58
+ TXT
59
+
60
+ recs = []
61
+ parser.parse( txt ) { |rec| recs << rec }
62
+ assert_equal records, recs
63
+ end
64
+
65
+
66
+ def test_read
67
+ assert_equal records, parser.read( "#{CsvJson.test_data_dir}/hello.json.csv" )
68
+ assert_equal records, parser.read( "#{CsvJson.test_data_dir}/hello11.json.csv" )
69
+ end
70
+
71
+
72
+ def test_open
73
+ assert_equal records, parser.open( "#{CsvJson.test_data_dir}/hello.json.csv", "r:bom|utf-8" ).read
74
+ assert_equal records, parser.open( "#{CsvJson.test_data_dir}/hello11.json.csv", "r:bom|utf-8" ).read
75
+ end
76
+
77
+
78
+ def test_foreach
79
+ recs = []
80
+ parser.foreach( "#{CsvJson.test_data_dir}/hello.json.csv" ) { |rec| recs << rec }
81
+ assert_equal records, recs
82
+
83
+ recs = []
84
+ parser.foreach( "#{CsvJson.test_data_dir}/hello11.json.csv" ) { |rec| recs << rec }
85
+ assert_equal records, recs
86
+ end
87
+
88
+
89
+ def test_enum
90
+ csv = CsvJson.new( <<TXT )
91
+ # hello world
92
+
93
+ 1,"John","12 Totem Rd. Aspen",true
94
+ 2,"Bob",null,false
95
+ 3,"Sue","Bigsby, 345 Carnival, WA 23009",false
96
+ TXT
97
+
98
+ it = csv.to_enum
99
+ assert_equal [1, "John", "12 Totem Rd. Aspen", true], it.next
100
+ assert_equal [2, "Bob", nil, false], it.next
101
+ assert_equal [3, "Sue", "Bigsby, 345 Carnival, WA 23009", false], it.next
102
+ end
103
+
104
+ end # class TestParser
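`test_enum` above only exercises external iteration with `next`. Since the reader includes `Enumerable`, anything built on `each` (such as `map` and `select`) should be available as well. The sketch below shows the pattern with a small stand-in class; `TinyReader` is hypothetical and only mimics the shape of the real reader so that the example runs without the gem installed.

```ruby
require 'json'

# Hypothetical stand-in: defines each and gets the rest from Enumerable.
class TinyReader
  include Enumerable

  def initialize( text )
    @text = text
  end

  def each
    return to_enum unless block_given?   # to_enum defaults to :each

    @text.each_line do |line|
      line = line.strip
      next if line.empty? || line.start_with?( '#' )
      yield JSON.parse( "[#{line}]" )
    end
  end
end

reader = TinyReader.new( "1,\"John\",true\n2,\"Bob\",false\n3,\"Sue\",false\n" )

names    = reader.map    { |rec| rec[1] }   # internal iteration via Enumerable
regulars = reader.select { |rec| rec[2] }

it    = reader.to_enum                      # external iteration
first = it.next
```

With a String as input the reader can be iterated more than once, as done here; with an IO object each pass would continue from the current file position.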
data/test/test_parser_misc.rb ADDED
@@ -0,0 +1,80 @@
1
+ # encoding: utf-8
2
+
3
+ ###
4
+ # to run use
5
+ # ruby -I ./lib -I ./test test/test_parser_misc.rb
6
+
7
+
8
+ require 'helper'
9
+
10
+ class TestParserMisc < Minitest::Test
11
+
12
+
13
+ def parser
14
+ CsvJson
15
+ end
16
+
17
+
18
+ def test_quotes_and_commas
19
+ assert_equal [
20
+ [1, "John", "12 Totem Rd., Aspen", true],
21
+ [2, "Bob", nil, false],
22
+ [3, "Sue", "\"Bigsby\", 345 Carnival, WA 23009", false]
23
+ ], parser.parse( <<TXT )
24
+ 1,"John","12 Totem Rd., Aspen",true
25
+ 2,"Bob",null,false
26
+ 3,"Sue","\\"Bigsby\\", 345 Carnival, WA 23009",false
27
+ TXT
28
+ end
29
+
30
+
31
+ def test_arrays
32
+ assert_equal [
33
+ [1, "directions", ["north","south","east","west"]],
34
+ [2, "colors", ["red","green","blue"]],
35
+ [3, "drinks", ["soda","water","tea","coffe"]],
36
+ [4, "spells", []],
37
+ ], parser.parse( <<TXT )
38
+ # CSV <3 JSON with array values
39
+
40
+ 1,"directions",["north","south","east","west"]
41
+ 2,"colors",["red","green","blue"]
42
+ 3,"drinks",["soda","water","tea","coffe"]
43
+ 4,"spells",[]
44
+ TXT
45
+ end
46
+
47
+ def test_misc
48
+ ## note:
49
+ ## in the csv <3 json source text backslash needs to get doubled / escaped twice e.g.
50
+ ## \\" for quotes
51
+ ## \\n for newlines and so on
52
+
53
+ assert_equal [
54
+ ["index", "value1", "value2"],
55
+ ["number", 1, 2],
56
+ ["boolean", false, true],
57
+ ["null", nil, "non null"],
58
+ ["array of numbers", [1], [1,2]],
59
+ ["simple object", {"a" => 1}, {"a" => 1, "b" => 2}],
60
+ ["array with mixed objects", [1, nil,"ball"], [2,{"a" => 10, "b" => 20},"cube"]],
61
+ ["string with quotes", "a\"b", "alert(\"Hi!\")"],
62
+ ["string with bell&newlines","bell is \u0007","multi\nline\ntext"]
63
+ ], parser.parse( <<TXT )
64
+ # CSV with all kinds of values
65
+
66
+ "index","value1","value2"
67
+ "number",1,2
68
+ "boolean",false,true
69
+ "null",null,"non null"
70
+ "array of numbers",[1],[1,2]
71
+ "simple object",{"a": 1},{"a":1, "b":2}
72
+ "array with mixed objects",[1,null,"ball"],[2,{"a": 10, "b": 20},"cube"]
73
+ "string with quotes","a\\"b","alert(\\"Hi!\\")"
74
+ "string with bell&newlines","bell is \\u0007","multi\\nline\\ntext"
75
+ TXT
76
+
77
+ end
78
+
79
+
80
+ end # class TestParserMisc
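The note in `test_misc` about doubling backslashes is a property of Ruby's regular heredocs, not of the CSV <3 JSON format itself: a plain `<<TXT` heredoc processes escapes, so `\\` is needed to get a single backslash into the text. As an alternative sketch (not used by the tests in this release), a single-quoted heredoc `<<'TXT'` leaves backslashes untouched, so the source text can be written exactly as it would appear in a file.

```ruby
require 'json'

# Regular heredoc: escapes are processed, so every backslash is doubled.
doubled = <<TXT
"string with quotes","a\\"b"
"string with newlines","multi\\nline"
TXT

# Single-quoted heredoc: no escape processing, backslashes stay as typed.
raw = <<'TXT'
"string with quotes","a\"b"
"string with newlines","multi\nline"
TXT

same = ( doubled == raw )

# Decode each line the same way the parser does (wrap in [] and JSON.parse).
records = raw.each_line.map { |line| JSON.parse( "[#{line.strip}]" ) }
```

Both heredocs produce the same text, and only the JSON decoding step turns `\"` into a quote and `\n` into a real newline.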
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csvjson
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 1.0.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Gerald Bauer
@@ -54,7 +54,13 @@ files:
54
54
  - README.md
55
55
  - Rakefile
56
56
  - lib/csvjson.rb
57
+ - lib/csvjson/parser.rb
57
58
  - lib/csvjson/version.rb
59
+ - test/data/hello.json.csv
60
+ - test/data/hello11.json.csv
61
+ - test/helper.rb
62
+ - test/test_parser.rb
63
+ - test/test_parser_misc.rb
58
64
  homepage: https://github.com/csv11/csvjson
59
65
  licenses:
60
66
  - Public Domain