csvhuman 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 358d150c2a69a16f741b0dae47328857787d2dc2
- data.tar.gz: fd36923138a7453510d2d26a4e2997c4475b3aea
+ metadata.gz: 0e03d4dc51acff7d6b47f1648abb47cfaa2a9028
+ data.tar.gz: b4921c44a67c57feae5c1f62eff5aa87ef81c996
  SHA512:
- metadata.gz: 1540846d223cb4bcf8dd4d2982f5cdc96b13966328f4037d84d6b1a7f8a3bee998856edcc3aaeb7d14e476f12e5a3cc5fd81d81a6aeed6a44e55c39a24cd8144
- data.tar.gz: 2fbc7a4ee6f22f75ab4cdea35c5411fc311baf4185d5ebc3012ae4b99a43d301a93a70ff388ac58aaa3eec0eb630d9c5742e94e67cb583214d88dfb05fadebad
+ metadata.gz: 675050a1e5af601ea6634fe17c0dcea511c917170438469c5f09349e4bd26678b5d42cc7cd5b9c97c2455b61ea67cd5719bff7c4637971849ae056955d562f2b
+ data.tar.gz: 9a7da3cdf466ebfec142344c505558b2c86fd38bee9c3b7d766c69cd0d127e5c42f4f854a6f61df7e4c57610ea4b2bec3bbdc121821a9c72b875cb470af2b50f
@@ -3,8 +3,11 @@ Manifest.txt
  README.md
  Rakefile
  lib/csvhuman.rb
+ lib/csvhuman/column.rb
  lib/csvhuman/reader.rb
+ lib/csvhuman/tag.rb
  lib/csvhuman/version.rb
  test/data/test.csv
  test/helper.rb
  test/test_reader.rb
+ test/test_tags.rb
data/README.md CHANGED
@@ -10,9 +10,221 @@ csvhuman library / gem - read tabular data in the CSV Humanitarian eXchange Lang
10
10
 
11
11
 
12
12
 
+
+ ## What's Humanitarian eXchange Language (HXL)?
+
+ [Humanitarian eXchange Language (HXL)](https://github.com/csvspecs/csv-hxl)
+ is a (meta data) convention for
+ adding agreed-on hashtags, e.g. `#org,#country,#sex+#targeted,#adm1`,
+ inline in a single new row
+ between the last header row and the first data row
+ for sharing tabular data across organisations
+ (during a humanitarian crisis).
+ Example:
+
+
+ ```
+ What,,,Who,Where,For whom,
+ Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
+ ,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
+ 001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
+ 002,Health,Subsector 2,Org 2,Country 2,,,Region 2
+ 003,Education,Subsector 3,Org 3,Country 2,250,300,Region 3
+ 004,WASH,Subsector 4,Org 1,Country 3,80,95,Region 4
+ ```
+
+
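Reviewer note: the row layout described in the README section above (header rows, then one hashtag row, then data rows) can be sketched in plain Ruby. This is only an illustration under stated assumptions: `partition_hxl_rows` is a hypothetical helper that is not part of csvhuman, and the naive comma split works here only because no field in the sample is quoted.

```ruby
# Hypothetical helper (not part of csvhuman): split rows into
# header rows, the hashtag row, and data rows.
def partition_hxl_rows( rows )
  # the hashtag row is the first row with at least one cell starting with '#'
  tag_index = rows.index do |row|
    row.any? { |cell| cell && cell.strip.start_with?( '#' ) }
  end
  return [rows, nil, []] if tag_index.nil?   # no hashtag row found

  [rows[0...tag_index], rows[tag_index], rows[(tag_index + 1)..-1]]
end

text = <<TXT
What,,,Who,Where,For whom,
Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
002,Health,Subsector 2,Org 2,Country 2,,,Region 2
TXT

# Naive comma split; -1 keeps trailing empty fields.
rows = text.lines.map { |line| line.chomp.split( ',', -1 ) }

header, tags, data = partition_hxl_rows( rows )
puts "header rows: #{header.size}"
puts "tags:        #{tags.inspect}"
puts "data rows:   #{data.size}"
```

The detection rule mirrors the check in the reader hunk of this diff (`value.strip.start_with?('#')`); everything else is a simplification.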
13
37
  ## Usage
14
38
 
- to be done
+ Pass in an array of arrays (or a stream whose `#each` yields arrays of strings).
+ Example:
+
+
+ ``` ruby
+ pp CsvHuman.parse( [["Organisation", "Cluster", "Province" ], ## or use HXL.parse
+                     [ "#org", "#sector", "#adm1" ],
+                     [ "Org A", "WASH", "Coastal Province" ],
+                     [ "Org B", "Health", "Mountain Province" ],
+                     [ "Org C", "Education", "Coastal Province" ],
+                     [ "Org A", "WASH", "Plains Province" ]] )
+ ```
+
+ resulting in:
+
+ ``` ruby
+ [{"org" => "Org A", "sector" => "WASH", "adm1" => "Coastal Province"},
+  {"org" => "Org B", "sector" => "Health", "adm1" => "Mountain Province"},
+  {"org" => "Org C", "sector" => "Education", "adm1" => "Coastal Province"},
+  {"org" => "Org A", "sector" => "WASH", "adm1" => "Plains Province"}]
+ ```
60
+
61
+ Or pass in the text. Example:
62
+
63
+ ``` ruby
64
+ pp CsvHuman.parse( <<TXT ) ## or use HXL.parse
65
+ What,,,Who,Where,For whom,
66
+ Record,Sector/Cluster,Subsector,Organisation,Country,Males,Females,Subregion
67
+ ,#sector+en,#subsector,#org,#country,#sex+#targeted,#sex+#targeted,#adm1
68
+ 001,WASH,Subsector 1,Org 1,Country 1,100,100,Region 1
69
+ 002,Health,Subsector 2,Org 2,Country 2,,,Region 2
70
+ 003,Education,Subsector 3,Org 3,Country 2,250,300,Region 3
71
+ 004,WASH,Subsector 4,Org 1,Country 3,80,95,Region 4
72
+ TXT
73
+ ```
74
+
75
+ resulting in:
76
+
77
+ ```
78
+ [{"sector+en" => "WASH",
79
+ "subsector" => "Subsector 1",
80
+ "org" => "Org 1",
81
+ "country" => "Country 1",
82
+ "sex+targeted" => ["100", "100"],
83
+ "adm1" => "Region 1"},
84
+ {"sector+en" => "Health",
85
+ "subsector" => "Subsector 2",
86
+ "org" => "Org 2",
87
+ "country" => "Country 2",
88
+ "sex+targeted" => ["", ""],
89
+ "adm1" => "Region 2"},
90
+ {"sector+en" => "Education",
91
+ "subsector" => "Subsector 3",
92
+ "org" => "Org 3",
93
+ "country" => "Country 2",
94
+ "sex+targeted" => ["250", "300"],
95
+ "adm1" => "Region 3"},
96
+ {"sector+en" => "WASH",
97
+ "subsector" => "Subsector 4",
98
+ "org" => "Org 1",
99
+ "country" => "Country 3",
100
+ "sex+targeted" => ["80", "95"],
101
+ "adm1" => "Region 4"}]
102
+ ```
103
+
104
+
105
+ More ways to use the reader:
106
+
107
+ ``` ruby
108
+ csv = CsvHuman.new( recs )
109
+ csv.each do |rec|
110
+ pp rec
111
+ end
112
+
113
+ pp csv.read
114
+
115
+
116
+ CsvHuman.parse( recs ).each do |rec|
117
+ pp rec
118
+ end
119
+
120
+
121
+ pp CsvHuman.read( "./test.csv" )
122
+
123
+ CsvHuman.foreach( "./test.csv" ) do |rec|
124
+ pp rec
125
+ end
126
+
127
+ #...
128
+
129
+ ```
130
+
131
+ or use the `HXL` alias:
132
+
133
+ ``` ruby
134
+ hxl = HXL.new( recs )
135
+ hxl.each do |rec|
136
+ pp rec
137
+ end
138
+
139
+ pp hxl.read
140
+
141
+
142
+ HXL.parse( recs ).each do |rec|
143
+ pp rec
144
+ end
145
+
146
+
147
+ pp HXL.read( "./test.csv" )
148
+
149
+ HXL.foreach( "./test.csv" ) do |rec|
150
+ pp rec
151
+ end
152
+
153
+ #...
154
+ ```
155
+
156
+ Note: More aliases for `CsvHuman`, `HXL`? Yes, you can use
157
+ `CsvHum`, `CSV_HXL`, `CSVHXL` too.
158
+
159
+
160
+
161
+
162
+
163
+ ## Tag Helpers
164
+
+ **Normalize**. Use `CsvHuman::Tag.normalize` to pretty print or normalize a tag.
+ All parts get downcased (lowercased), all attributes get sorted a-to-z,
+ all extra or missing hashtags or pluses get removed or added, and
+ all extra or missing spaces get removed or added. Example:
169
+
170
+ ``` ruby
171
+ HXL::Tag.normalize( "#sector+en" )
172
+ # => "#sector +en"
173
+ HXL::Tag.normalize( "#SECTOR EN" )
174
+ # => "#sector +en"
175
+ HXL::Tag.normalize( "# SECTOR + #EN " )
176
+ # => "#sector +en"
177
+ HXL::Tag.normalize( "SECTOR EN" )
178
+ # => "#sector +en"
179
+ # ...
180
+ ```
181
+
182
+
183
+ **Split**. Use `CsvHuman::Tag.split` to split (and normalize) a tag into its parts.
184
+ Example:
185
+
186
+ ``` ruby
187
+ HXL::Tag.split( "#sector+en" )
188
+ # => ["sector", "en"]
189
+ HXL::Tag.split( "#SECTOR EN" )
190
+ # => ["sector", "en"]
191
+ HXL::Tag.split( "# SECTOR + #EN " )
192
+ # => ["sector", "en"]
193
+ HXL::Tag.split( "SECTOR EN" )
194
+ # => ["sector", "en"]
195
+
196
+ ## sort attributes a-to-z
197
+ HXL::Tag.split( "#affected +f +children" )
198
+ # => ["affected", "children", "f"]
199
+ HXL::Tag.split( "#population +children +affected +m" )
200
+ # => ["population", "affected", "children", "m"]
201
+ HXL::Tag.split( "#population+children+affected+m" )
202
+ # => ["population", "affected", "children", "m"]
203
+ HXL::Tag.split( "#population+#children+#affected+#m" )
204
+ # => ["population", "affected", "children", "m"]
205
+ HXL::Tag.split( "#population #children #affected #m" )
206
+ # => ["population", "affected", "children", "m"]
207
+ HXL::Tag.split( "POPULATION CHILDREN AFFECTED M" )
208
+ # => ["population", "affected", "children", "m"]
209
+ #...
210
+ ```
211
+
212
+
213
+
214
+
215
+ ## Frequently Asked Questions (FAQ) and Answers
216
+
217
+
218
+ ### Q: How to deal with un-tagged fields?
219
+
220
+ **A**: Un-tagged fields get skipped / ignored.
221
+
222
+
223
+ ### Q: How to deal with duplicate / repeated fields (e.g. `#sex+#targeted,#sex+#targeted`)?
224
+
225
+ **A**: Repeated fields (auto-magically) get turned into an array / list.
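Reviewer note: the two FAQ answers (un-tagged fields are skipped, repeated tags become a list) can be illustrated with a small self-contained sketch. `build_records` is a hypothetical name and the code is a simplified stand-in for illustration, not the gem's reader.

```ruby
# Hypothetical sketch (not csvhuman's code): build one record hash per data row.
# keys holds one entry per column; nil marks an un-tagged column.
def build_records( keys, rows )
  # keys that occur more than once are treated as list columns
  repeated = keys.compact.select { |key| keys.count( key ) > 1 }.uniq

  rows.map do |values|
    record = {}
    keys.each_with_index do |key, i|
      next if key.nil?                      # un-tagged field: skip
      if repeated.include?( key )
        (record[key] ||= []) << values[i]   # repeated tag: collect into a list
      else
        record[key] = values[i]
      end
    end
    record
  end
end

keys = [nil, "org", "sex+targeted", "sex+targeted", "adm1"]
rows = [["001", "Org 1", "100", "100", "Region 1"],
        ["002", "Org 2", "",    "",    "Region 2"]]

records = build_records( keys, rows )
pp records
```

Note that empty cells in a repeated column stay in the list as empty strings, which matches the `["", ""]` entry in the README output shown earlier in this diff.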
226
+
227
+
16
228
 
17
229
 
18
230
  ## License
@@ -1,13 +1,14 @@
1
1
  # encoding: utf-8
2
2
 
3
3
  require 'pp'
4
- require 'logger'
5
4
 
6
5
 
7
6
  require 'csvreader'
8
7
 
9
8
  ## our own code
10
9
  require 'csvhuman/version' # note: let version always go first
10
+ require 'csvhuman/tag'
11
+ require 'csvhuman/column'
11
12
  require 'csvhuman/reader'
12
13
 
13
14
 
@@ -0,0 +1,89 @@
+ # encoding: utf-8
+
+
+ class CsvHuman
+
+
+   class Columns
+
+
+     def self.build( values )
+
+       ## "clean" unify/normalize names
+       tag_keys = values.map do |value|
+         if value
+           if value.empty?
+             nil
+           else
+             ## e.g. #ADM1 CODE => #adm1 +code
+             ##      POPULATION F CHILDREN AFFECTED => #population +affected +children +f
+             value = Tag.normalize( value )
+             ## turn empty normalized tags (e.g. "stray" hashtag) into nil too
+             value = nil if value.empty?
+             value
+           end
+         else # keep (nil) as is
+           nil
+         end
+       end
+
+
+       counts = {}
+       tag_keys.each_with_index do |key,i|
+         if key
+           counts[key] ||= []
+           counts[key] << i
+         end
+       end
+       ## puts "counts:"
+       ## pp counts
+
+       ## create all unique tags
+       tags = {}
+       counts.each_key do |key|
+         tags[key] = Tag.parse( key )
+       end
+       ## puts "tags:"
+       ## pp tags
+
+
+       cols = []
+       tag_keys.each do |key|
+         if key
+           count = counts[key]
+           tag = tags[key] ## note: "reuse" tag for all columns if list
+           if count.size > 1
+             ## note: defaults to use "standard/default" tag key (as a string)
+             cols << Column.new( tag.key, tag, list: true )
+           else
+             cols << Column.new( tag.key, tag )
+           end
+         else
+           cols << Column.new
+         end
+       end
+
+       cols
+     end
+   end ## class Columns
+
+
+
+
+   class Column
+     attr_reader :key # used for record (record key); note: list columns must use the same key
+     attr_reader :tag
+
+
+     def initialize( key=nil, tag=nil, list: false )
+       @key = key
+       @tag = tag
+       @list = list
+     end
+
+
+     def tagged?() @tag.nil? == false; end
+     def list?() @list; end
+   end # class Column
+
+ end # class CsvHuman
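Reviewer note: the middle step of `Columns.build` groups column indexes by normalized tag key, and any key that collects more than one index marks a list column. A minimal standalone sketch of that grouping, assuming the keys are already normalized (the variable names `positions` and `list_keys` are illustrative, not taken from the gem):

```ruby
# Already-normalized tag keys, one per column; nil marks an un-tagged column.
tag_keys = [nil, "#sector +en", "#org", "#sex +targeted", "#sex +targeted", "#adm1"]

# Group column indexes by key, skipping un-tagged columns.
positions = Hash.new { |hash, key| hash[key] = [] }
tag_keys.each_with_index do |key, i|
  positions[key] << i if key
end

# A key seen in more than one column becomes a list column.
list_keys = positions.select { |_key, indexes| indexes.size > 1 }.keys

pp positions
pp list_keys
```

Keeping the indexes (rather than a plain count) is what the diffed code does as well; only the size of each index list is consulted when the columns are created.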
@@ -65,21 +65,6 @@ class CsvHuman
65
65
 
66
66
 
67
67
 
68
-
69
- class Column
70
- attr_reader :tag
71
-
72
- def initialize( tag=nil, list: false )
73
- @tag = tag
74
- @list = list
75
- end
76
-
77
- def tagged?() @tag.nil? == false; end
78
- def list?() @list; end
79
- end # class Column
80
-
81
-
82
-
83
68
  attr_reader :header, :tags
84
69
 
85
70
  def initialize( recs_or_stream )
@@ -106,8 +91,8 @@ def each( &block )
106
91
  @recs.each do |values|
107
92
  ## pp values
108
93
  if @cols.nil?
109
- if values.any? { |value| value && value.start_with?('#') }
110
- @cols = build_cols( values )
94
+ if values.any? { |value| value && value.strip.start_with?('#') }
95
+ @cols = Columns.build( values )
111
96
  @tags = values
112
97
  else
113
98
  @header << values
@@ -119,8 +104,8 @@ def each( &block )
119
104
  record = {}
120
105
  @cols.each_with_index do |col,i|
121
106
  if col.tagged?
122
- key = col.tag
123
- value = values[i]
107
+ key = col.key
108
+ value = values[i] ## todo/fix: use col.tag.typecast( values[i] )
124
109
  if col.list?
125
110
  record[ key ] ||= []
126
111
  record[ key ] << value
@@ -144,54 +129,4 @@ def read() to_a; end # method read
144
129
  ## add closed? and close
145
130
  ## if self.open used without block (user needs to close file "manually")
146
131
 
147
-
148
- ####
149
- # helpers
150
-
151
-
152
- def build_cols( values )
153
-
154
- ## "clean" unify/normalize names
155
- values = values.map do |value|
156
- if value
157
- if value.empty?
158
- nil ## make untagged fields nil
159
- else
160
- ## todo: sort attributes by a-to-z
161
- ## strip / remove all spaces
162
- value.strip.gsub('#','') ## remove leading # - why? why not?
163
- end
164
- else
165
- value ## keep (nil) as is
166
- end
167
- end
168
-
169
-
170
- counts = {}
171
- values.each_with_index do |value,i|
172
- if value
173
- counts[value] ||= []
174
- counts[value] << i
175
- end
176
- end
177
- ## pp counts
178
-
179
-
180
- cols = []
181
- values.each do |value|
182
- if value
183
- count = counts[value]
184
- if count.size > 1
185
- cols << Column.new( value, list: true )
186
- else
187
- cols << Column.new( value )
188
- end
189
- else
190
- cols << Column.new
191
- end
192
- end
193
-
194
- cols
195
- end
196
-
197
132
  end # class CsvHuman
@@ -0,0 +1,162 @@
+ # encoding: utf-8
+
+ class CsvHuman
+
+
+
+   class Tag
+
+     ## 1) plus (with optional hashtag and/or optional leading and trailing spaces)
+     ## 2) hashtag (with optional leading and trailing spaces)
+     ## 3) spaces only (not followed by plus) or
+     ## note: plus pattern must go first (otherwise "sector + en" becomes ["sector", "", "en"])
+     SEP_REGEX = /(?: \s*\++
+                      (?:\s*\#+)?
+                      \s* )
+                    |
+                  (?: \s*\#+\s* )
+                    |
+                  (?: \s+)
+                 /x ## check if \s includes space AND tab?
+
+
+
+     def self.split( value )
+       value = value.strip
+       value = value.downcase
+       while value.start_with?('#') do ## allow one or more hashes
+         value = value[1..-1] ## remove leading #
+         value = value.strip ## strip (optional) leading spaces (again)
+       end
+       ## pp value
+       parts = value.split( SEP_REGEX )
+
+       ## sort attributes a-z
+       if parts.size > 2
+         [parts[0]] + parts[1..-1].sort
+       else
+         parts
+       end
+     end
+
+
+     def self.normalize( value ) ## todo: rename to pretty or something or add alias
+       parts = split( value )
+       name = parts[0]
+       attributes = parts[1..-1] ## note: might be nil
+
+       buf = ''
+       if name ## note: name might be nil too e.g. value = "" or value = " "
+         buf << '#' + name
+         if attributes && attributes.size > 0
+           buf << ' +'
+           buf << attributes.join(' +')
+         end
+       end
+       buf
+     end
+
+
+     def self.guess_type( name, attributes )
+
+       if name == 'date'
+         Date
+       elsif ['affected', 'inneed'].include?( name )
+         Integer
+       else
+         ## check attributes
+         if attributes.nil? || attributes.empty?
+           String ## assume (default to) string
+         elsif attributes.include?( 'num' )
+           Integer
+         elsif attributes.include?( 'date' ) ### todo/check: exists +date?
+           Date
+         elsif attributes.include?( 'affected' )
+           Integer
+         else
+           String ## assume (default to) string
+         end
+       end
+     end
+
+
+
+     def self.parse( value )
+       parts = split( value )
+
+       name = parts[0]
+       attributes = parts[1..-1] ## todo/fix: check if nil (make it empty array [] always) - why? why not?
+       type = guess_type( name, attributes )
+
+       new( name, attributes, type )
+     end
+
+
+
+
+     attr_reader :name
+     attr_reader :attributes ## use attribs or something shorter - why? why not?
+     attr_reader :type
+
+     def initialize( name, attributes=nil, type=String )
+       @name = name
+       ## sorted a-z - note: make sure attributes is [] NOT nil if empty - why? why not?
+       @attributes = attributes || []
+       @type = type ## type class (defaults to String)
+     end
+
+
+     def key
+       ## convenience short cut for "standard/default" string key
+       ## cache/pre-built/memoize - why? why not?
+       ## builds:
+       ## population+affected+children+f
+
+       buf = ''
+       buf << @name
+       if @attributes && @attributes.size > 0
+         buf << '+'
+         buf << @attributes.join('+')
+       end
+       buf
+     end
+
+     def to_s
+       ## cache/pre-built/memoize - why? why not?
+       ##
+       ## builds
+       ## #population +affected +children +f
+
+       buf = ''
+       buf << '#' + @name
+       if @attributes && @attributes.size > 0
+         buf << ' +'
+         buf << @attributes.join(' +')
+       end
+       buf
+     end
+
+
+     def typecast( value ) ## use convert or call - why? why not?
+       if @type == Integer
+         conv_to_i( value )
+       else ## assume String
+         # pass through as is
+         value
+       end
+     end
+
+     private
+     def conv_to_i( value )
+       if value.nil? || value.empty?
+         nil ## return nil - why? why not?
+       else
+         Integer( value )
+       end
+     end
+
+
+   end # class Tag
+
+
+ end # class CsvHuman
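Reviewer note: the reader hunk earlier in this diff leaves a todo to call `col.tag.typecast( values[i] )`. The sketch below shows what that conversion would do to one row, using a hypothetical top-level `typecast` function that copies the rule from `Tag#typecast` (only `Integer` is converted, everything else passes through). It is not wired into the gem, and the per-column types are assumed for the example.

```ruby
# Hypothetical stand-in for Tag#typecast: only Integer columns are converted.
def typecast( value, type )
  return value unless type == Integer            # String (and Date) pass through
  return nil if value.nil? || value.empty?       # empty cell becomes nil
  Integer( value )                               # raises ArgumentError on bad input
end

values = ["100", "", "Region 1"]
types  = [Integer, Integer, String]              # assumed per-column types

casted = values.zip( types ).map { |value, type| typecast( value, type ) }
pp casted
```

One thing to keep in mind if this gets wired in: `Kernel#Integer` honors radix prefixes, so a zero-padded cell such as `"010"` would be read as octal. Passing the base explicitly, as in `Integer( value, 10 )`, avoids that.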
@@ -4,7 +4,7 @@
4
4
  class CsvHuman
5
5
 
6
6
  MAJOR = 0
7
- MINOR = 1
7
+ MINOR = 2
8
8
  PATCH = 0
9
9
  VERSION = [MAJOR,MINOR,PATCH].join('.')
10
10
 
@@ -18,6 +18,26 @@ def recs
18
18
  [ "Org A", "WASH", "Plains Province" ]]
19
19
  end
20
20
 
21
+ def recs2
22
+ [["Organisation", "Cluster", "Province" ],
23
+ [ "ORG", "#SECTOR", "ADM1" ],
24
+ [ "Org A", "WASH", "Coastal Province" ],
25
+ [ "Org B", "Health", "Mountain Province" ],
26
+ [ "Org C", "Education", "Coastal Province" ],
27
+ [ "Org A", "WASH", "Plains Province" ]]
28
+ end
29
+
30
+
31
+ def expected_recs
32
+ [{"org"=>"Org A", "sector"=>"WASH", "adm1"=>"Coastal Province"},
33
+ {"org"=>"Org B", "sector"=>"Health", "adm1"=>"Mountain Province"},
34
+ {"org"=>"Org C", "sector"=>"Education", "adm1"=>"Coastal Province"},
35
+ {"org"=>"Org A", "sector"=>"WASH", "adm1"=>"Plains Province"}]
36
+ end
37
+
38
+
39
+
40
+
21
41
  def txt
22
42
  <<TXT
23
43
  What,,,Who,Where,For whom,
@@ -38,7 +58,10 @@ def test_readme
38
58
  end
39
59
 
40
60
  pp csv.read
41
- pp CsvHuman.parse( recs )
61
+
62
+ assert_equal expected_recs, CsvHuman.parse( recs )
63
+ assert_equal expected_recs, CsvHuman.parse( recs2 )
64
+
42
65
 
43
66
  CsvHuman.parse( recs ).each do |rec|
44
67
  pp rec
@@ -0,0 +1,106 @@
1
+ # encoding: utf-8
2
+
3
+ ###
4
+ # to run use
5
+ # ruby -I ./lib -I ./test test/test_tags.rb
6
+
7
+
8
+ require 'helper'
9
+
10
+ class TestTags < MiniTest::Test
11
+
12
+ def split( value )
13
+ CsvHuman::Tag.split( value ) ## returns an array of strings (name+attributes[])
14
+ end
15
+
16
+ def normalize( value )
17
+ CsvHuman::Tag.normalize( value ) ## returns a string
18
+ end
19
+
20
+ def parse( value )
21
+ CsvHuman::Tag.parse( value ) ## returns a Tag instance
22
+ end
23
+
24
+
25
+
26
+ def test_split
27
+ assert_equal [], split( "" ) # empty
28
+ assert_equal [], split( " " ) # empty
29
+
30
+ ## more empties (all matched by separator regex/pattern)
31
+ ## keep as empty - why? why not?
32
+ assert_equal [], split( " # " ) # empty
33
+ assert_equal [], split( " ## " ) # empty
34
+ assert_equal [], split( " + " ) # empty
35
+ assert_equal [], split( " +++ " ) # empty
36
+ assert_equal [], split( " +++## " ) # empty
37
+
38
+
39
+ assert_equal ["sector", "en"], split( "#sector+en" )
40
+ assert_equal ["sector", "en"], split( "#SECTOR EN" )
41
+ assert_equal ["sector", "en"], split( " # SECTOR + EN " )
42
+ assert_equal ["sector", "en"], split( "SeCtOr en" )
43
+ assert_equal ["sector", "en"], split( "#sector#en" )
44
+ assert_equal ["sector", "en"], split( "#sector+#en" ) ## allow (optional) hash for attributes
45
+ assert_equal ["sector", "en"], split( "##sector#en" ) ## allow hash only for attributes
46
+ assert_equal ["sector", "en"], split( "# #sector+++ ##en" ) ## allow one or more plus or hashes (typos) for attributes
47
+
48
+
49
+ assert_equal ["adm1", "code"], split( "#ADM1 +CODE" )
50
+ assert_equal ["adm1", "code"], split( " # ADM1 + CODE" )
51
+ assert_equal ["adm1", "code"], split( "ADM1 CODE" )
52
+
53
+ ## sort attributes a-to-z
54
+ assert_equal ["affected", "children", "f"], split( "#affected +f +children" )
55
+ assert_equal ["population", "affected", "children", "m"], split( "#population +children +affected +m" )
56
+ assert_equal ["population", "affected", "children", "m"], split( "#population+children+affected+m" )
57
+ assert_equal ["population", "affected", "children", "m"], split( "#population+#children+#affected+#m" )
58
+ assert_equal ["population", "affected", "children", "m"], split( "#population #children #affected #m" )
59
+ assert_equal ["population", "affected", "children", "m"], split( "POPULATION CHILDREN AFFECTED M" )
60
+ end
61
+
62
+
63
+ def test_normalize
64
+ assert_equal "", normalize( "" ) # empty
65
+ assert_equal "", normalize( " " ) # empty
66
+
67
+ assert_equal "#sector +en", normalize( "#sector+en" )
68
+ assert_equal "#sector +en", normalize( "#SECTOR EN" )
69
+ assert_equal "#sector +en", normalize( " # SECTOR + EN " )
70
+ assert_equal "#sector +en", normalize( " # SECTOR # EN " )
71
+ assert_equal "#sector +en", normalize( "SeCToR en" )
72
+
73
+ assert_equal "#adm1 +code", normalize( "#ADM1 +CODE" )
74
+ assert_equal "#adm1 +code", normalize( " # ADM1 + CODE" )
75
+ assert_equal "#adm1 +code", normalize( " # ADM1 + #CODE" )
76
+ assert_equal "#adm1 +code", normalize( "ADM1 Code" )
77
+
78
+ ## sort attributes a-to-z
79
+ assert_equal "#affected +children +f", normalize( "#affected +f +children" )
80
+ assert_equal "#population +affected +children +m", normalize( "#population +children +affected +m" )
81
+ assert_equal "#population +affected +children +m", normalize( "#population+children+affected+m" )
82
+ assert_equal "#population +affected +children +m", normalize( "POPULATION CHILDREN AFFECTED M" )
83
+ end
84
+
85
+
86
+ def test_parse
87
+ tag = parse( "#sector+en" )
88
+ assert_equal "#sector +en", tag.to_s
89
+ assert_equal "sector", tag.name
90
+ assert_equal ["en"], tag.attributes
91
+ assert_equal String, tag.type
92
+
93
+ assert_equal "#sector +en", parse( "#SECTOR EN" ).to_s
94
+ assert_equal "#sector +en", parse( " # SECTOR + EN " ).to_s
95
+
96
+
97
+ tag = parse( "#adm1" )
98
+ assert_equal "#adm1", tag.to_s
99
+ assert_equal "adm1", tag.name
100
+ assert_equal [], tag.attributes
101
+ assert_equal String, tag.type
102
+
103
+ assert_equal "#adm1", parse( "ADM1" ).to_s
104
+ end
105
+
106
+ end # class TestTags
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: csvhuman
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Gerald Bauer
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2018-11-06 00:00:00.000000000 Z
11
+ date: 2018-11-10 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: csvreader
@@ -68,11 +68,14 @@ files:
68
68
  - README.md
69
69
  - Rakefile
70
70
  - lib/csvhuman.rb
71
+ - lib/csvhuman/column.rb
71
72
  - lib/csvhuman/reader.rb
73
+ - lib/csvhuman/tag.rb
72
74
  - lib/csvhuman/version.rb
73
75
  - test/data/test.csv
74
76
  - test/helper.rb
75
77
  - test/test_reader.rb
78
+ - test/test_tags.rb
76
79
  homepage: https://github.com/csvreader/csvhuman
77
80
  licenses:
78
81
  - Public Domain