sycsvpro 0.1.1 → 0.1.2

data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- sycsvpro (0.1.1)
4
+ sycsvpro (0.1.2)
5
5
  gli (= 2.9.0)
6
6
 
7
7
  GEM
data/README.md CHANGED
@@ -109,7 +109,9 @@ Allocate all the machine types to the customer
109
109
 
110
110
  Count
111
111
  -----
112
- Count all customers (key column) in rows 2 to 20 that have machines that start with *h* and have a contract valid beginning after 1.1.2000. Add a sum row with title Total at column 1
112
+ Count all customers (key column) in rows 2 to 20 that have machines that start
113
+ with *h* and have a contract valid beginning after 1.1.2000. Add a sum row with
114
+ title Total at column 1
113
115
 
114
116
  $ sycsvpro -f in.csv -o out.csv count -r 2-20 -k 0:customer -c 1:/^h/,5:">1.1.2000" --df "%d.%m.%Y" -s "Total:1"
115
117
 
@@ -126,7 +128,8 @@ It is possible to use multiple key columns `-k 0:customer,1:machines`
126
128
 
127
129
  Aggregate
128
130
  ---------
129
- Aggregate row values and add the sum to the end of the row. In the example we aggregate the customer names.
131
+ Aggregate row values and add the sum to the end of the row. In the example we
132
+ aggregate the customer names.
130
133
 
131
134
  $ sycsvpro -f in.csv -o out.csv aggregate -c 0 -s Total:1,Sum
132
135
 
@@ -141,7 +144,8 @@ The aggregation result in out.csv is
141
144
 
142
145
  Calc
143
146
  ----
144
- Process arithmetic operations on the contract count and create a target column and a sum which is added at the end of the result file
147
+ Process arithmetic operations on the contract count and create a target column
148
+ and a sum which is added at the end of the result file
145
149
 
146
150
  $ sycsvpro -f in.csv -o out.csv calc -r 2-20 -h *,target -c 6:*2,7:target=c6*10
147
151
 
@@ -154,11 +158,13 @@ Process arithmetic operations on the contract count and create a target column a
154
158
  chiro;c2;con331;dri100;mot130;3.05.3010;2;20
155
159
  0;0;0;0;0;0;10;100
156
160
 
157
- In the sum row non-numbers in the colums are converted to 0. Therefore column 0 is summed up to 0 as all strings are converted to 0.
161
+ In the sum row non-numbers in the colums are converted to 0. Therefore column 0
162
+ is summed up to 0 as all strings are converted to 0.
158
163
 
159
164
  Sort
160
165
  ----
161
- Sort rows on specified columns as an example sort rows based on customer (string s) and contract date (date d)
166
+ Sort rows on specified columns as an example sort rows based on customer
167
+ (string s) and contract date (date d)
162
168
 
163
169
  $ sycsvpro -f in.csv -o out.csv sort -r 2-20 -c s:0,d:5
164
170
 
@@ -169,35 +175,44 @@ Sort rows on specified columns as an example sort rows based on customer (string
169
175
  chiro;c2;con331;dri100;mot130;3.05.3010;1
170
176
  chiro;c1;con333;dri110;mot100;1.10.3011;1
171
177
 
172
- Sort expects the first non-empty row as the header row. If --headerless switch is set then sort assumes no header being available.
178
+ Sort expects the first non-empty row as the header row. If --headerless switch
179
+ is set then sort assumes no header being available.
173
180
 
174
181
  Insert
175
182
  ------
176
- Add rows at the bottom or on top of a file. The command below adds the content of the file file-with-rows-to-insert.text on top of the file in.csv and saves it to out.csv
183
+ Add rows at the bottom or on top of a file. The command below adds the content
184
+ of the file file-with-rows-to-insert.text on top of the file in.csv and saves
185
+ it to out.csv
177
186
 
178
187
  $ sycsvpro -f in.csv -o out.csv insert file-with-rows-to-insert.txt -p top
179
188
 
180
189
  Edit
181
190
  ----
182
- Creates or if it exists opens a file for editing. The file is created in the directory ~/.syc/sycsvpro/scripts. Following command creates a Ruby script with the name script.rb and a method call_me
191
+ Creates or if it exists opens a file for editing. The file is created in the
192
+ directory ~/.syc/sycsvpro/scripts. Following command creates a Ruby script with
193
+ the name script.rb and a method call_me
183
194
 
184
195
  $ sycsvpro edit -s script.rb -m call_me
185
196
 
186
197
  List
187
198
  ----
188
- List the scripts or insert-file available in the scripts directory
199
+ List the scripts, insert-file or all scripts available in the scripts directory
200
+ which is also displayed
189
201
 
202
+ script directory: ~/.syc/sycsvpro/scripts
190
203
  $ sycsvpro list -m
191
204
  script.rb
192
205
  call_me
193
206
 
194
207
  Execute
195
208
  -------
196
- Execute takes a Ruby script file as an argument and processes the script. The following command executes the script *script.rb* and invokes the method *calc*
209
+ Execute takes a Ruby script file as an argument and processes the script. The
210
+ following command executes the script *script.rb* and invokes the method *calc*
197
211
 
198
212
  $ sycsvpro execute ./script.rb calc
199
213
 
200
- Below is an example script file that is ultimately doing the same as the count command
214
+ Below is an example script file that is ultimately doing the same as the count
215
+ command
201
216
 
202
217
  $ sycsvpro -f in.csv -o out.csv count -r 1-20 -k 0 -c 4,5
203
218
 
@@ -232,15 +247,20 @@ def calc
232
247
  end
233
248
  ```
234
249
 
235
- *rows* and *write_to* are convenience methods provided by sycsvpro that can be used in script files to operate on files.
250
+ *rows* and *write_to* are convenience methods provided by sycsvpro that can be
251
+ used in script files to operate on files.
236
252
 
237
- *rows* will return values at the specified columns in the order they are provided in the call to
238
- rows. The columns to be returned in the block have to end with _column_ or _columns_ dependent if a value or an array should be returned. You can find the *rows* and *write_to* methods at _lib/sycsvpro/dsl.rb_.
253
+ *rows* will return values at the specified columns in the order they are
254
+ provided in the call to rows. The columns to be returned in the block have to
255
+ end with _column_ or _columns_ dependent if a value or an array should be
256
+ returned. You can find the *rows* and *write_to* methods at
257
+ _lib/sycsvpro/dsl.rb_.
239
258
 
240
259
  Working with sycsvpro
241
260
  =====================
242
261
 
243
- sycsvpro emerged from my daily work when cleaning and anaylzing data. If you want to dig deeper I would recommend [R](http://www.r-project.org/).
262
+ sycsvpro emerged from my daily work when cleaning and anaylzing data. If you
263
+ want to dig deeper I would recommend [R](http://www.r-project.org/).
244
264
 
245
265
  A work flow could be as follows
246
266
 
@@ -251,7 +271,29 @@ A work flow could be as follows
251
271
  * Do arithmetic operations on the values `calc`
252
272
  * Sort the rows based on column values
253
273
 
254
- When I have analyzed the data I use _Microsoft Excel_ or _LibreOffice Calc_ to create nice graphs. To create more sophisiticated analysis *R* is the right tool to use.
274
+ When I have analyzed the data I use _Microsoft Excel_ or _LibreOffice Calc_ to
275
+ create nice graphs. To create more sophisiticated analysis *R* is the right tool
276
+ to use.
277
+
278
+ Release notes
279
+ =============
280
+
281
+ Version 0.1.2
282
+ -------------
283
+ * Now it is possible to have , in the filter as non separating values. You can
284
+ now define filter like 1-2,4,/[56789]{2,}/,10
285
+ * Filtering rows on boolean expression based on values contained in columns.
286
+ The boolean expression has to be enclosed between BEGIN and END
287
+ Example:
288
+ -r BEGINs0=='Ruby'&&n1<1||d2==Date.new(2014,6,17)END
289
+ s0 - string in column 0
290
+ n1 - number in column 1
291
+ d2 - date in column 2
292
+ * ``list`` shows the directory of the script file and has the flag *all* to
293
+ show all scripts, that is _insert files_ and _Ruby files_
294
+ * When counting columns with *count* the column headers are sorted
295
+ alphabetically. Now it is possible to set ``sort: false`` to keep the column
296
+ headers in the sequence they are specified
255
297
 
256
298
  Installation
257
299
  ============
data/bin/sycsvpro CHANGED
@@ -11,6 +11,12 @@ end
11
11
 
12
12
  include GLI::App
13
13
 
14
+ row_regex = %r{
15
+ \d+(?:,\d+|-\d+|-eof|,\/.*\/)*|
16
+ \/.*\/(?:,\/.*\/|\d+)*|
17
+ BEGIN.*?END
18
+ }xi
19
+
14
20
  # Directory holding configuration files
15
21
  sycsvpro_directory = File.expand_path("~/.syc/sycsvpro")
16
22
 
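To get a feel for what the shared `row_regex` accepts, the sketch below copies the pattern from the hunk above and tries it against a few sample `-r` arguments. The helper name `row_filter_argument?` and the sample strings are assumptions made for illustration; how GLI applies `:must_match` internally is not shown here.

```ruby
# Hypothetical helper: true when the row pattern from the hunk above
# matches somewhere in the given argument.
def row_filter_argument?(argument)
  # The x flag allows the pattern to be spread over several lines.
  row_regex = %r{
    \d+(?:,\d+|-\d+|-eof|,\/.*\/)*|
    \/.*\/(?:,\/.*\/|\d+)*|
    BEGIN.*?END
  }xi
  !(argument =~ row_regex).nil?
end

samples = ["1,2,10-30,45-EOF", "/^h/", "BEGINs0=='Ruby'END", "abc"]
samples.each { |s| puts "#{s.inspect}: #{row_filter_argument?(s)}" }
```

The pattern is not anchored, so any argument that contains a digit, a `/.../` pair or a `BEGIN...END` pair matches.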
@@ -58,17 +64,24 @@ end
58
64
  desc 'Extract specified rows and columns from the file'
59
65
  command :extract do |c|
60
66
  c.desc 'Rows to extract'
61
- c.arg_name '1,2,10-30,45-EOF,REGEXP'
62
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
67
+ c.arg_name '1,2,10-30,45-EOF,REGEXP,BEGINlogical_expressionEND'
68
+ c.flag [:r, :row], :must_match => row_regex
63
69
 
64
70
  c.desc 'Columns to extract'
65
71
  c.arg_name '1,2,10-30'
66
72
  c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
67
73
 
74
+ c.desc 'Format of date values'
75
+ c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
76
+ c.flag [:df]
77
+
68
78
  c.action do |global_options,options,args|
69
79
  print "Extracting ..."
70
- extractor = Sycsvpro::Extractor.new(infile: global_options[:f], outfile: global_options[:o],
71
- rows: options[:r], cols: options[:c])
80
+ extractor = Sycsvpro::Extractor.new(infile: global_options[:f],
81
+ outfile: global_options[:o],
82
+ rows: options[:r],
83
+ cols: options[:c],
84
+ df: options[:df])
72
85
  extractor.execute
73
86
  puts "done"
74
87
  end
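The new `--df` flag takes a `strptime`-style format string. As a rough illustration, assuming the value ends up in `Date.strptime` (as it does in `match_boolean_filter?` elsewhere in this diff), the two formats named in `arg_name` parse as follows. The helper `parse_date` and the sample dates are hypothetical.

```ruby
require 'date'

# Hypothetical helper: parse a cell value with the format passed to --df.
def parse_date(value, date_format)
  Date.strptime(value, date_format)
end

puts parse_date("17.06.2014", "%d.%m.%Y")  # day.month.year
puts parse_date("06/17/2014", "%m/%d/%Y")  # month/day/year
```

Both calls describe the same day; only the format string tells the parser which field is the month.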
@@ -79,16 +92,23 @@ command :collect do |c|
79
92
 
80
93
  c.desc 'Rows to consider for collection'
81
94
  c.arg_name 'ROW1,ROW2,ROW10-ROW30,45-EOF,REGEXP'
82
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
95
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
83
96
 
84
97
  c.desc 'Columns to collect values from'
85
98
  c.arg_name 'CATEGORY1:COL1,COL2,COL10-COL30+CATEGORY2:COL3-COL9'
86
99
  c.flag [:c, :col], :must_match => /^\w*:\d+(?:,\d+|-\d+|\+\w*:\d+(?:,\d+|-\d+)*)*/
87
100
 
101
+ c.desc 'Format of date values'
102
+ c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
103
+ c.flag [:df]
104
+
88
105
  c.action do |global_options,options,args|
89
106
  print "Collecting ..."
90
- collector = Sycsvpro::Collector.new(infile: global_options[:f], outfile: global_options[:o],
91
- rows: options[:r], cols: options[:c])
107
+ collector = Sycsvpro::Collector.new(infile: global_options[:f],
108
+ outfile: global_options[:o],
109
+ rows: options[:r],
110
+ cols: options[:c],
111
+ df: options[:df])
92
112
  collector.execute
93
113
  puts "done"
94
114
  end
@@ -98,7 +118,7 @@ desc 'Allocate specified columns from the file to a key value'
98
118
  command :allocate do |c|
99
119
  c.desc 'Rows to consider'
100
120
  c.arg_name '1,2,10-30,45-EOF,REGEXP'
101
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
121
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
102
122
 
103
123
  c.desc 'Key to allocate columns to'
104
124
  c.arg_name '0'
@@ -108,10 +128,18 @@ command :allocate do |c|
108
128
  c.arg_name '1,2,10-30'
109
129
  c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
110
130
 
131
+ c.desc 'Format of date values'
132
+ c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
133
+ c.flag [:df]
134
+
111
135
  c.action do |global_options,options,args|
112
136
  print "Allocating ..."
113
- allocator = Sycsvpro::Allocator.new(infile: global_options[:f], outfile: global_options[:o],
114
- key: options[:k], rows: options[:r], cols: options[:c])
137
+ allocator = Sycsvpro::Allocator.new(infile: global_options[:f],
138
+ outfile: global_options[:o],
139
+ key: options[:k],
140
+ rows: options[:r],
141
+ cols: options[:c],
142
+ df: options[:df])
115
143
  allocator.execute
116
144
  puts "done"
117
145
  end
@@ -136,10 +164,10 @@ end
136
164
 
137
165
  desc 'Lists script or insert files in the scripts directory with optionally listing methods of script files'
138
166
  command :list do |c|
139
- c.desc 'Type of script (Ruby or insert file)'
167
+ c.desc 'Type of script (Ruby, insert or all files)'
140
168
  c.default_value 'script'
141
- c.arg_name 'SCRIPT|INSERT'
142
- c.flag [:t, :type], :must_match => /script|insert/i
169
+ c.arg_name 'SCRIPT|INSERT|ALL'
170
+ c.flag [:t, :type], :must_match => /script|insert|all/i
143
171
 
144
172
  c.desc 'Name of the script file'
145
173
  c.arg_name 'SCRIPT_NAME.rb|INSERT_NAME.ins'
@@ -148,12 +176,19 @@ command :list do |c|
148
176
  c.desc 'Show methods'
149
177
  c.switch [:m, :method]
150
178
 
179
+ c.desc 'Show script directory'
180
+ c.switch [:d, :dir]
181
+
151
182
  c.action do |global_options,options,args|
152
- script_list = Sycsvpro::ScriptList.new(dir: script_directory, type: options[:t],
153
- script: options[:s], show_methods: options[:m])
183
+ script_list = Sycsvpro::ScriptList.new(dir: script_directory,
184
+ type: options[:t],
185
+ script: options[:s],
186
+ show_methods: options[:m])
154
187
 
155
188
  scripts = script_list.execute
156
189
 
190
+ puts "script directory: #{script_directory}" if options[:d]; puts
191
+
157
192
  if scripts.empty?
158
193
  help_now! "No scripts available. You can create scripts with the edit command"
159
194
  else
@@ -194,11 +229,11 @@ command :count do |c|
194
229
 
195
230
  c.desc 'Key columns that are assigned the count of column values'
196
231
  c.arg_name 'COLUMN:TITLE,COLUMN:TITLE'
197
- c.flag [:k, :key], :must_match => /^\d+:\w+(?:,\d+:\w+)*/
232
+ c.flag [:k, :key], :required => true, :must_match => /^\d+:\w+(?:,\d+:\w+)*/
198
233
 
199
234
  c.desc 'Rows to consider'
200
235
  c.arg_name '1,2,10-30,45-EOF,REGEXP'
201
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
236
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
202
237
 
203
238
  c.desc 'Columns to count where columns 2 and 3 are counted conditionally'
204
239
  c.arg_name '1,2:<14.2.2014,10-30,3:>10'
@@ -212,11 +247,19 @@ command :count do |c|
212
247
  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
213
248
  c.flag [:df]
214
249
 
250
+ c.desc 'Sort headline values'
251
+ c.switch [:sort], :default_value => true
252
+
215
253
  c.action do |global_options,options,args|
216
254
  print "Counting..."
217
- counter = Sycsvpro::Counter.new(infile: global_options[:f], outfile: global_options[:o],
218
- key: options[:k], rows: options[:r], cols: options[:c],
219
- df: options[:df], sum: options[:s])
255
+ counter = Sycsvpro::Counter.new(infile: global_options[:f],
256
+ outfile: global_options[:o],
257
+ key: options[:k],
258
+ rows: options[:r],
259
+ cols: options[:c],
260
+ df: options[:df],
261
+ sum: options[:s],
262
+ sort: options[:sort])
220
263
  counter.execute
221
264
  puts "done"
222
265
  end
@@ -229,7 +272,7 @@ command :aggregate do |c|
229
272
 
230
273
  c.desc 'Rows to consider'
231
274
  c.arg_name '1,2,10-30,45-EOF,REGEXP'
232
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
275
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
233
276
 
234
277
  c.desc 'Columns to count'
235
278
  c.arg_name '1,2-4'
@@ -240,10 +283,18 @@ command :aggregate do |c|
240
283
  c.arg_name 'SUM_ROW_TITLE:ROW,SUM_COL_TITLE'
241
284
  c.flag [:s, :sum], :must_match => /^\w+:\d+(?:,\w+)?|^\w+/
242
285
 
286
+ c.desc 'Format of date values'
287
+ c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
288
+ c.flag [:df]
289
+
243
290
  c.action do |global_options,options,args|
244
291
  print "Aggregating..."
245
- aggregator = Sycsvpro::Aggregator.new(infile: global_options[:f], outfile: global_options[:o],
246
- rows: options[:r], cols: options[:c], sum: options[:s])
292
+ aggregator = Sycsvpro::Aggregator.new(infile: global_options[:f],
293
+ outfile: global_options[:o],
294
+ rows: options[:r],
295
+ cols: options[:c],
296
+ sum: options[:s],
297
+ df: options[:df])
247
298
  aggregator.execute
248
299
  puts "done"
249
300
  end
@@ -254,7 +305,7 @@ desc 'Sort rows based on column values'
254
305
  command :sort do |c|
255
306
  c.desc 'Rows to consider'
256
307
  c.arg_name '1,2,10-30,45-EOF,REGEXP'
257
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
308
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
258
309
 
259
310
  c.desc 'Columns to sort based on a type (n = number, s = string, d = date) and its value'
260
311
  c.arg_name 'n:1,s:2-5,d:7'
@@ -310,18 +361,27 @@ arg_name 'MAPPINGS-FILE'
310
361
  command :map do |c|
311
362
  c.desc 'Rows to consider'
312
363
  c.arg_name 'ROW1,ROW2,ROW10-ROW30,45-EOF,REGEXP'
313
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
364
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
314
365
 
315
366
  c.desc 'Columns to consider for mapping'
316
367
  c.arg_name 'COL1,COL2,COL10-COL30'
317
368
  c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
369
+
370
+ c.desc 'Format of date values'
371
+ c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
372
+ c.default_value '%Y-%m-%d'
373
+ c.flag [:df]
318
374
 
319
375
  c.action do |global_options,options,args|
320
376
  help_now! "You need to provide a mapping file" if args.size == 0
321
377
 
322
378
  print "Mapping..."
323
- mapper = Sycsvpro::Mapper.new(infile: global_options[:f], outfile: global_options[:o],
324
- mapping: args[0], rows: options[:r], cols: options[:c])
379
+ mapper = Sycsvpro::Mapper.new(infile: global_options[:f],
380
+ outfile: global_options[:o],
381
+ mapping: args[0],
382
+ rows: options[:r],
383
+ cols: options[:c],
384
+ df: options[:df])
325
385
  mapper.execute
326
386
  puts "done"
327
387
  end
@@ -336,9 +396,9 @@ command :calc do |c|
336
396
  default_value '*'
337
397
  c.flag [:h, :header], :must_match => /\*(?:,\w+)*/
338
398
 
339
- c.desc 'Columns to consider for calculations'
399
+ c.desc 'Rows to consider for calculations'
340
400
  c.arg_name 'ROW1,ROW2-ROW10,45-EOF,REGEXP'
341
- c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
401
+ c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
342
402
 
343
403
  c.desc 'Column to do calculations on'
344
404
  c.arg_name 'COL1:*2,COL2:-C3,COL3:*2+(4+C5),COL6:NEW_COL=C1+5'
@@ -356,9 +416,13 @@ command :calc do |c|
356
416
  help_now! "You need to provide the column flag" if options[:c].nil?
357
417
 
358
418
  print "Calculating..."
359
- calculator = Sycsvpro::Calculator.new(infile: global_options[:f], outfile: global_options[:o],
360
- header: options[:h], rows: options[:r], cols: options[:c],
361
- sum: options[:s], df: options[:df])
419
+ calculator = Sycsvpro::Calculator.new(infile: global_options[:f],
420
+ outfile: global_options[:o],
421
+ header: options[:h],
422
+ rows: options[:r],
423
+ cols: options[:c],
424
+ sum: options[:s],
425
+ df: options[:df])
362
426
  calculator.execute
363
427
  puts "done"
364
428
  end
@@ -41,7 +41,7 @@ module Sycsvpro
41
41
  @infile = options[:infile]
42
42
  @outfile = options[:outfile]
43
43
  @headerless = options[:headerless] || false
44
- @row_filter = RowFilter.new(options[:rows])
44
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
45
45
  @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
46
46
  @key_values = Hash.new(0)
47
47
  @heading = []
@@ -19,8 +19,8 @@ module Sycsvpro
19
19
  def initialize(options={})
20
20
  @infile = options[:infile]
21
21
  @outfile = options[:outfile]
22
- @key_filter = ColumnFilter.new(options[:key])
23
- @row_filter = RowFilter.new(options[:rows])
22
+ @key_filter = ColumnFilter.new(options[:key], df: options[:df])
23
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
24
24
  @col_filter = ColumnFilter.new(options[:cols])
25
25
  end
26
26
 
@@ -37,7 +37,7 @@ module Sycsvpro
37
37
  @infile = options[:infile]
38
38
  @outfile = options[:outfile]
39
39
  @date_format = options[:df] || "%Y-%m-%d"
40
- @row_filter = RowFilter.new(options[:rows])
40
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
41
41
  @header = Header.new(options[:header])
42
42
  @sum_row = []
43
43
  @add_sum_row = options[:sum] || false
@@ -20,7 +20,7 @@ module Sycsvpro
20
20
  def initialize(options={})
21
21
  @infile = options[:infile]
22
22
  @outfile = options[:outfile]
23
- @row_filter = RowFilter.new(options[:rows])
23
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
24
24
  @collection = {}
25
25
  init_collection(options[:cols])
26
26
  end
@@ -27,6 +27,8 @@ module Sycsvpro
27
27
  attr_reader :key_values
28
28
  # header of the out file
29
29
  attr_reader :heading
30
+ # indicates whether the headline values should be sorted
31
+ attr_reader :heading_sort
30
32
  # Title of the sum row
31
33
  attr_reader :sum_row_title
32
34
  # row where to add the sums of the columns
@@ -39,15 +41,16 @@ module Sycsvpro
39
41
  # Creates a new counter. Takes as attributes infile, outfile, key, rows, cols, date-format and
40
42
  # indicator whether to add a sum row
41
43
  def initialize(options={})
42
- @infile = options[:infile]
43
- @outfile = options[:outfile]
44
+ @infile = options[:infile]
45
+ @outfile = options[:outfile]
44
46
  init_key_columns(options[:key])
45
- @row_filter = RowFilter.new(options[:rows])
46
- @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
47
- @key_values = {}
48
- @heading = []
47
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
48
+ @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
49
+ @key_values = {}
50
+ @heading = []
51
+ @heading_sort = options[:sort].nil? ? true : options[:sort]
49
52
  init_sum_scheme(options[:sum])
50
- @sums = Hash.new(0)
53
+ @sums = Hash.new(0)
51
54
  end
52
55
 
53
56
  # Executes the counter
@@ -82,17 +85,18 @@ module Sycsvpro
82
85
  # Writes the count results
83
86
  def write_result
84
87
  sum_line = [sum_row_title] + [''] * (key_titles.size - 1)
85
- heading.sort.each do |h|
88
+ headline = heading_sort ? heading.sort : col_filter.pivot.keys
89
+ headline.each do |h|
86
90
  sum_line << sums[h]
87
91
  end
88
92
  row = 0;
89
93
  File.open(outfile, 'w') do |out|
90
94
  out.puts sum_line.join(';') if row == sum_row ; row += 1
91
- out.puts (key_titles + heading.sort).join(';')
95
+ out.puts (key_titles + headline).join(';')
92
96
  key_values.each do |k,v|
93
97
  out.puts sum_line.join(';') if row == sum_row ; row += 1
94
98
  line = [k]
95
- heading.sort.each do |h|
99
+ headline.each do |h|
96
100
  line << v[:elements][h] unless h == sum_col_title
97
101
  end
98
102
  line << v[:sum] unless sum_col_title.nil?
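The effect of the new `sort` option on the header row can be sketched in isolation. The helper `headline` below is a hypothetical stand-in for the choice made in `write_result`; it works on a plain array instead of `col_filter.pivot.keys`, and the column titles are taken from the counter spec in this diff.

```ruby
# Hypothetical helper mirroring the choice in write_result: sorted titles
# when sort is true, otherwise the order in which the columns were given.
def headline(specified, sort = true)
  sort ? specified.sort : specified
end

cols = ["<1.1.2013", "1.1.2013-31.12.2014", ">31.12.2014"]
puts headline(cols).join(';')         # default: alphabetical order
puts headline(cols, false).join(';')  # sort: false keeps the given order
```

With plain string sorting the range title moves ahead of the `<` title, which is why the spec's expected header changes once `sort: false` is passed.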
@@ -20,8 +20,8 @@ module Sycsvpro
20
20
  def initialize(options={})
21
21
  @in_file = options[:infile]
22
22
  @out_file = options[:outfile]
23
- @row_filter = RowFilter.new(options[:rows])
24
- @col_filter = ColumnFilter.new(options[:cols])
23
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
24
+ @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
25
25
  end
26
26
 
27
27
  # Executes the extractor
@@ -11,6 +11,8 @@ module Sycsvpro
11
11
  attr_reader :date_format
12
12
  # Filter for rows and columns
13
13
  attr_reader :filter
14
+ # Boolean for rows
15
+ attr_reader :boolean_filter
14
16
  # Type of column (n = number, s = string)
15
17
  attr_reader :types
16
18
  # Pattern that is used as a filter
@@ -25,21 +27,32 @@ module Sycsvpro
25
27
  @types = []
26
28
  @pattern = []
27
29
  @pivot = {}
30
+ @boolean_filter = ""
28
31
  create_filter(values)
29
32
  end
30
33
 
31
34
  # Creates the filters based on the given patterns
32
35
  def method_missing(id, *args, &block)
36
+ boolean_row_regex = %r{
37
+ BEGIN(\(*[nsd]\d+[<!=~>]{1,2}
38
+ (?:[A-Z][A-Za-z]*\.new\(.*?\)|\d+|['"].*?['"])
39
+ (?:\)*(?:&&|\|\||$)
40
+ \(*[nsd]\d+[<!=~>]{1,2}
41
+ (?:[A-Z][A-Za-z]*\.new\(.*?\)|\d+|['"].*?['"])\)*)*)END
42
+ }xi
43
+
44
+ return boolean_row($1, args, block) if id =~ boolean_row_regex
33
45
  return equal($1, args, block) if id =~ /^(\d+)$/
34
46
  return equal_type($1, $2, args, block) if id =~ /^(s|n|d):(\d+)$/
35
47
  return range($1, $2, args, block) if id =~ /^(\d+)-(\d+)$/
36
48
  return range_type($1, $2, $3, args, block) if id =~ /^(s|n|d):(\d+)-(\d+)$/
37
49
  return regex($1, args, block) if id =~ /^\/(.*)\/$/
38
50
  return col_regex($1, $2, args, block) if id =~ /^(\d+):\/(.*)\/$/
39
- return date($1, $2, $3, args, block) if id =~ /^(\d+):(<|=|>)(\d+.\d+.\d+)/
51
+ return date($1, $2, $3, args, block) if id =~ /^(\d+):(<|=|>)(\d+.\d+.\d+)$/
40
52
  return date_range($1, $2, $3, args, block) if id =~ /^(\d+):(\d+.\d+.\d+.)-(\d+.\d+.\d+)$/
41
- return number($1, $2, $3, args, block) if id =~ /^(\d+):(<|=|>)(\d+)/
42
- return number_range($1, $2, $3, args, block) if id =~ /^(\d):(\d+)-(\d+)/
53
+ return number($1, $2, $3, args, block) if id =~ /^(\d+):(<|=|>)(\d+)$/
54
+ return number_range($1, $2, $3, args, block) if id =~ /^(\d):(\d+)-(\d+)$/
55
+
43
56
  super
44
57
  end
45
58
 
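One visible effect of the added `$` anchors is that a filter with trailing characters no longer counts as a number filter. The sketch below compares the old and new `number` patterns from the hunk above; the two helper names and the sample ids are assumptions for illustration.

```ruby
# Pattern as it was before the change: matches any id with this prefix.
def number_filter_unanchored?(id)
  !(id =~ /^(\d+):(<|=|>)(\d+)/).nil?
end

# Pattern after the change: the id has to end right after the number.
def number_filter_anchored?(id)
  !(id =~ /^(\d+):(<|=|>)(\d+)$/).nil?
end

["3:>10", "3:>10abc"].each do |id|
  puts "#{id}: unanchored=#{number_filter_unanchored?(id)} " \
       "anchored=#{number_filter_anchored?(id)}"
end
```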
@@ -48,6 +61,44 @@ module Sycsvpro
48
61
  raise 'Needs to be overridden by sub class'
49
62
  end
50
63
 
64
+ # Checks whether the values match the boolean filter
65
+ def match_boolean_filter?(values=[])
66
+ return false if boolean_filter.empty? or values.empty?
67
+ expression = boolean_filter
68
+ columns = expression.scan(/(([nsd])(\d+))([<!=~>]{1,2})(.*?)(?:[\|&]{2}|$)/)
69
+ # STDERR.puts "expr = #{expression.inspect}"
70
+ # STDERR.puts "vals = #{values.inspect}"
71
+ # STDERR.puts "cols = #{columns.inspect}"
72
+ columns.each do |c|
73
+ # STDERR.puts "val = #{values[c[2].to_i].inspect}"
74
+ value = case c[1]
75
+ when 'n'
76
+ values[c[2].to_i].empty? ? '0' : values[c[2].to_i]
77
+ when 's'
78
+ "'#{values[c[2].to_i]}'"
79
+ when 'd'
80
+ begin
81
+ Date.strptime(values[c[2].to_i], date_format)
82
+ rescue Exception => e
83
+ case c[3]
84
+ when '<', '<=', '=='
85
+ "#{c[4]}+1"
86
+ when '>', '>='
87
+ '0'
88
+ when '!='
89
+ c[4]
90
+ end
91
+ else
92
+ "Date.strptime('#{values[c[2].to_i]}', '#{date_format}')"
93
+ end
94
+ end
95
+ expression = expression.gsub(c[0], value)
96
+ # STDERR.puts "val2 = #{value}"
97
+ end
98
+ # STDERR.puts "exp = #{expression.inspect}"
99
+ eval(expression)
100
+ end
101
+
51
102
  # Yields the column value and whether the filter matches the column
52
103
  def pivot_each_column(values=[])
53
104
  pivot.each do |column, parameters|
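The idea behind `match_boolean_filter?` is to replace column references with the row's values and then evaluate the result. The following is a simplified sketch of that idea, not the gem's implementation: the method name `row_matches?` is hypothetical, it skips the fallback handling for unparsable dates, and it substitutes every `s<n>`, `n<n>` or `d<n>` token it finds.

```ruby
require 'date'

# Simplified sketch: replace each column reference (s0, n1, d2, ...) with a
# Ruby literal built from the row values, then evaluate the expression.
# Caution: eval runs arbitrary Ruby, so use it on trusted input only.
def row_matches?(expression, values, date_format = '%Y-%m-%d')
  ruby = expression.gsub(/\b([nsd])(\d+)\b/) do
    type  = Regexp.last_match(1)
    value = values[Regexp.last_match(2).to_i].to_s
    case type
    when 'n' then value.to_f.to_s   # number column
    when 's' then value.inspect     # string column, safely quoted
    when 'd' then "Date.strptime(#{value.inspect}, #{date_format.inspect})"
    end
  end
  eval(ruby)
end

filter = "s0=='Ruby'&&n1<1||d2==Date.new(2014,6,17)"
puts row_matches?(filter, ["Ruby", "0", "2014-06-17"])
puts row_matches?(filter, ["Perl", "5", "2014-06-18"])
```

The first row satisfies the string and number conditions; the second satisfies none of the three.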
@@ -65,14 +116,17 @@ module Sycsvpro
65
116
 
66
117
  # Checks whether a filter has been set. Returns true if filter has been set otherwise false
67
118
  def has_filter?
68
- return !(filter.empty? and pattern.empty?)
119
+ return !(filter.empty? and pattern.empty? and boolean_filter.empty?)
69
120
  end
70
121
 
71
122
  private
72
123
 
73
- # Creates a filter based on the provided rows and columns
124
+ # Creates a filter based on the provided rows and columns select criteria
74
125
  def create_filter(values)
75
- values.split(',').each { |f| send(f) } unless values.nil?
126
+ values.scan(/(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i).flatten.each do |value|
127
+ # STDERR.puts "value = #{value}"
128
+ send(value)
129
+ end unless values.nil?
76
130
  end
77
131
 
78
132
  # Adds a single value to the filter
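The switch from `split(',')` to `scan` is what allows commas inside a regex filter. The sketch below reuses the pattern from `create_filter` in a hypothetical helper, `split_filter`, and compares it with a plain split on the filter string from the release notes.

```ruby
# Split a filter string on commas while keeping /regex/ and BEGIN...END
# parts intact. The pattern is the one used in create_filter above.
def split_filter(values)
  values.scan(/(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i).flatten
end

filter = "1-2,4,/[56789]{2,}/,10"
p split_filter(filter)  # regex part stays in one piece
p filter.split(',')     # plain split cuts the regex at its comma
```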
@@ -110,6 +164,11 @@ module Sycsvpro
110
164
  pivot[r] = { col: col, operation: operation }
111
165
  end
112
166
 
167
+ # Adds a boolean row filter
168
+ def boolean_row(operation, args, block)
169
+ boolean_filter.clear << operation
170
+ end
171
+
113
172
  # Adds a date filter
114
173
  def date(col, comparator, date, args, block)
115
174
  comparator = '==' if comparator == '='
@@ -19,8 +19,8 @@ module Sycsvpro
19
19
  def initialize(options={})
20
20
  @infile = options[:infile]
21
21
  @outfile = options[:outfile]
22
- @row_filter = RowFilter.new(options[:row_filter])
23
- @col_filter = ColumnFilter.new(options[:col_filter])
22
+ @row_filter = RowFilter.new(options[:row_filter], df: options[:df])
23
+ @col_filter = ColumnFilter.new(options[:col_filter], df: options[:df])
24
24
  @mapper = {}
25
25
  init_mapper(options[:mapping])
26
26
  end
@@ -17,9 +17,10 @@ module Sycsvpro
17
17
  pattern.each do |p|
18
18
  filtered = (filtered or !(object =~ Regexp.new(p)).nil?)
19
19
  end
20
+ filtered = (filtered or match_boolean_filter?(object.split(';')))
20
21
  filtered ? object : nil
21
22
  end
22
23
 
23
24
  end
24
-
25
+
25
26
  end
@@ -21,6 +21,7 @@ module Sycsvpro
21
21
  @script_type.downcase!
22
22
  @script_file = options[:script] || '*.rb' if @script_type == 'script'
23
23
  @script_file = options[:script] || '*.ins' if @script_type == 'insert'
24
+ @script_file = options[:script] || '*.{rb,ins}' if @script_type == 'all'
24
25
  @show_methods = options[:show_methods] if @script_type == 'script'
25
26
  @show_methods = false if @script_type == 'insert'
26
27
  @list = {}
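The `'*.{rb,ins}'` pattern relies on brace expansion in `Dir.glob`. A minimal sketch, assuming a scripts directory that holds Ruby scripts, insert files and unrelated files; the helper `list_files` and the file names are invented for this illustration.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: names of the files in dir that match a glob pattern.
def list_files(dir, pattern)
  Dir.glob(File.join(dir, pattern)).map { |f| File.basename(f) }.sort
end

Dir.mktmpdir do |dir|
  # Sample files standing in for the contents of the scripts directory.
  %w[script.rb rows.ins notes.txt].each do |name|
    FileUtils.touch(File.join(dir, name))
  end

  p list_files(dir, '*.rb')          # type 'script'
  p list_files(dir, '*.ins')         # type 'insert'
  p list_files(dir, '*.{rb,ins}')    # type 'all'
end
```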
@@ -32,7 +32,7 @@ module Sycsvpro
32
32
  @outfile = options[:outfile]
33
33
  @headerless = options[:headerless] || false
34
34
  @desc = options[:desc] || false
35
- @row_filter = RowFilter.new(options[:rows])
35
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
36
36
  @col_type_filter = ColumnTypeFilter.new(options[:cols], df: options[:df])
37
37
  @sorted_rows = []
38
38
  end
@@ -1,5 +1,5 @@
1
1
  # Operating csv files
2
2
  module Sycsvpro
3
3
  # Version number of sycsvpro
4
- VERSION = '0.1.1'
4
+ VERSION = '0.1.2'
5
5
  end
@@ -33,15 +33,15 @@ module Sycsvpro
33
33
  it "should count date columns" do
34
34
  counter = Counter.new(infile: @in_file, outfile: @out_file, rows: "1-10",
35
35
  cols: "2:<1.1.2013,2:1.1.2013-31.12.2014,2:>31.12.2014",
36
- key: "0:customer", df: "%d.%m.%Y")
36
+ key: "0:customer", df: "%d.%m.%Y", sort: false)
37
37
 
38
38
  counter.execute
39
39
 
40
- result = [ "customer;1.1.2013-31.12.2014;<1.1.2013;>31.12.2014",
40
+ result = [ "customer;<1.1.2013;1.1.2013-31.12.2014;>31.12.2014",
41
41
  "Fink;0;0;2",
42
- "Haas;0;1;0",
43
- "Gent;1;0;0",
44
- "Rank;1;0;0" ]
42
+ "Haas;1;0;0",
43
+ "Gent;0;1;0",
44
+ "Rank;0;1;0" ]
45
45
 
46
46
  File.open(@out_file).each_with_index do |line, index|
47
47
  line.chomp.should eq result[index]
@@ -5,7 +5,8 @@ module Sycsvpro
5
5
  describe Extractor do
6
6
 
7
7
  before do
8
- @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
8
+ @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
9
+ @in_file2 = File.join(File.dirname(__FILE__), "files/in4.csv")
9
10
  @out_file = File.join(File.dirname(__FILE__), "files/out.csv")
10
11
  end
11
12
 
@@ -37,6 +38,20 @@ module Sycsvpro
37
38
 
38
39
  end
39
40
 
41
+ it "should extract rows base on regex including commas" do
42
+ extractor = Extractor.new(infile: @in_file2, outfile: @out_file, rows: "/[56789]\\d+|\\d{3,}/")
43
+
44
+ extractor.execute
45
+
46
+ result = [ "Gent;50",
47
+ "Haas;100",
48
+ "Klig;80" ]
49
+
50
+ File.open(@out_file).each_with_index do |line, index|
51
+ line.chomp.should eq result[index]
52
+ end
53
+ end
54
+
40
55
  end
41
56
 
42
57
  end
@@ -0,0 +1,76 @@
+ require 'sycsvpro/row_filter'
+
+ module Sycsvpro
+
+ describe RowFilter do
+
+ before do
+ @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
+ @out_file = File.join(File.dirname(__FILE__), "files/out.csv")
+ end
+
+ it "should return row string when no filter is set" do
+ row_filter = Sycsvpro::RowFilter.new(nil)
+ row_filter.process("abc", row: 1).should eq "abc"
+ end
+
+ it "should filter rows on index" do
+ rows = "1-5"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("abc", row: 1).should eq "abc"
+ row_filter.process("abc", row: 6).should be_nil
+ end
+
+ it "should filter rows on regex" do
+ rows = "1,\/\\d{2,}\/"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("5;50;500", row: 1).should eq "5;50;500"
+ row_filter.process("5;50;500", row: 2).should eq "5;50;500"
+ end
+
+ it "should filter rows on logical expression" do
+ rows = "BEGINn1>50&&s2=='Ruby'||n3<10END"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("a;49;Rub;9").should eq "a;49;Rub;9"
+ row_filter.process("a;51;Ruby;11").should eq "a;51;Ruby;11"
+ row_filter.process("a;49;Ruby;11").should be_nil
+ end
+
+ it "should filter rows on Ruby classes" do
+ rows = "BEGINn1==50&&d2==Date.new(2014,6,16)||s3=~Regexp.new('[56789]\\d{2,}')END"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("x;50;2014-06-16;99").should eq "x;50;2014-06-16;99"
+ end
+
+ it "should filter rows on row number filter and boolean filter" do
+ rows = "1,3-4,BEGINn1==50&&d2<Date.new(2014,6,16)||s3=='Works?'END"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("x;50;2014-06-15;Works?").should eq "x;50;2014-06-15;Works?"
+ row_filter.process("x;50;2014-06-15;Works?", row: 1).should eq "x;50;2014-06-15;Works?"
+ end
+
+ it "should filter rows on boolean filter with brackets" do
+ rows = "BEGINn1==50&&(d2<Date.new(2014,6,16)||s3=='Works?')END"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("x;50;2014-6-15;Works?").should eq "x;50;2014-6-15;Works?"
+ row_filter.process("x;49;2014-6-15;Works?").should be_nil
+ row_filter.process("x;50;2014-6-17;Worx?").should be_nil
+ end
+
+ it "should fitler rows with ' in value" do
+ rows = "BEGINn1!=50||n2=~'/\\d+/'||n2==\"Doesn't work\"END"
+ row_filter = Sycsvpro::RowFilter.new(rows)
+ row_filter.process("x;50;2;we").should be_nil
+ row_filter.process("x;49;/\\d+/;\"Doesn't work\"").should eq "x;49;/\\d+/;Doesn't work"
+ end
+
+ it "should not filter rows with invalid syntax" do
+ rows = "BEGINn1!=50||n2=~regex('\\d+')END"
+ expect { Sycsvpro::RowFilter.new(rows) }.to raise_error
+ end
+
+ end
+
+ end
+
+
metadata CHANGED
@@ -1,74 +1,84 @@
  --- !ruby/object:Gem::Specification
  name: sycsvpro
  version: !ruby/object:Gem::Version
- version: 0.1.1
+ version: 0.1.2
+ prerelease:
  platform: ruby
  authors:
  - Pierre Sugar
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2014-03-12 00:00:00.000000000 Z
+ date: 2014-06-17 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: rdoc
  requirement: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: aruba
  requirement: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: rspec
  requirement: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  - !ruby/object:Gem::Dependency
  name: gli
  requirement: !ruby/object:Gem::Requirement
+ none: false
  requirements:
  - - '='
  - !ruby/object:Gem::Version
@@ -76,6 +86,7 @@ dependencies:
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ none: false
  requirements:
  - - '='
  - !ruby/object:Gem::Version
@@ -89,8 +100,8 @@ extra_rdoc_files:
  - README.rdoc
  - sycsvpro.rdoc
  files:
- - ".gitignore"
- - ".rspec"
+ - .gitignore
+ - .rspec
  - Gemfile
  - Gemfile.lock
  - LICENSE
@@ -201,37 +212,39 @@ files:
  - spec/sycsvpro/inserter_spec.rb
  - spec/sycsvpro/mapper_spec.rb
  - spec/sycsvpro/profiler_spec.rb
+ - spec/sycsvpro/row_filter_spec.rb
  - spec/sycsvpro/script_list_spec.rb
  - spec/sycsvpro/sorter_spec.rb
  - sycsvpro.gemspec
  - sycsvpro.rdoc
  homepage: https://github.com/sugaryourcoffee/syc-svpro
  licenses: []
- metadata: {}
  post_install_message:
  rdoc_options:
- - "--title"
+ - --title
  - sycsvpro
- - "--main"
+ - --main
  - README.rdoc
- - "-ri"
+ - -ri
  require_paths:
  - lib
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  required_rubygems_version: !ruby/object:Gem::Requirement
+ none: false
  requirements:
- - - ">="
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 2.2.0
+ rubygems_version: 1.8.23
  signing_key:
- specification_version: 4
+ specification_version: 3
  summary: Processing of csv files
  test_files: []
checksums.yaml DELETED
@@ -1,7 +0,0 @@
- ---
- SHA1:
- metadata.gz: 7823aeea07dda43deb692fdf13a8f57559bbb917
- data.tar.gz: 4e3da29ada80c6c4a5eb282facce537e4750f0e3
- SHA512:
- metadata.gz: 203400b5c9187269c3fe336f940f25427eecfd5fe25a56fbc3fc982f7c47d059a6aefc5eb74b5ab5769d5a0b318ebf2ee72a231db64127cb24ca767ee8f5915f
- data.tar.gz: deb1cfff428e6e83f1770a2f67a7b4e6fcf5cc642ce6c96cd50ba748ecfb8e44aaa39b53b88bc08083e434857d2ab778a7d086f1e61d9de3b3d7c1568a9ce343