
sycsvpro 0.1.1 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

data/Gemfile.lock +1 -1
data/README.md +58 -16
data/bin/sycsvpro +96 -32
data/lib/sycsvpro/aggregator.rb +1 -1
data/lib/sycsvpro/allocator.rb +2 -2
data/lib/sycsvpro/calculator.rb +1 -1
data/lib/sycsvpro/collector.rb +1 -1
data/lib/sycsvpro/counter.rb +14 -10
data/lib/sycsvpro/extractor.rb +2 -2
data/lib/sycsvpro/filter.rb +65 -6
data/lib/sycsvpro/mapper.rb +2 -2
data/lib/sycsvpro/row_filter.rb +2 -1
data/lib/sycsvpro/script_list.rb +1 -0
data/lib/sycsvpro/sorter.rb +1 -1
data/lib/sycsvpro/version.rb +1 -1
data/spec/sycsvpro/counter_spec.rb +5 -5
data/spec/sycsvpro/extractor_spec.rb +16 -1
data/spec/sycsvpro/row_filter_spec.rb +76 -0
metadata +33 -20
checksums.yaml +0 -7

data/Gemfile.lock CHANGED

@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    sycsvpro (0.1.1)
+    sycsvpro (0.1.2)
       gli (= 2.9.0)
 GEM

data/README.md CHANGED

@@ -109,7 +109,9 @@ Allocate all the machine types to the customer
 Count
 -----
-Count all customers (key column) in rows 2 to 20 that have machines that start with *h* and have a contract valid beginning after 1.1.2000. Add a sum row with title Total at column 1
+Count all customers (key column) in rows 2 to 20 that have machines that start
+with *h* and have a contract valid beginning after 1.1.2000. Add a sum row with
+title Total at column 1
     $ sycsvpro -f in.csv -o out.csv count -r 2-20 -k 0:customer -c 1:/^h/,5:">1.1.2000" --df "%d.%m.%Y" -s "Total:1"
@@ -126,7 +128,8 @@ It is possible to use multiple key columns `-k 0:customer,1:machines`
 Aggregate
 ---------
-Aggregate row values and add the sum to the end of the row. In the example we aggregate the customer names.
+Aggregate row values and add the sum to the end of the row. In the example we
+aggregate the customer names.
     $ sycsvpro -f in.csv -o out.csv aggregate -c 0 -s Total:1,Sum
@@ -141,7 +144,8 @@ The aggregation result in out.csv is
 Calc
 ----
-Process arithmetic operations on the contract count and create a target column and a sum which is added at the end of the result file
+Process arithmetic operations on the contract count and create a target column
+and a sum which is added at the end of the result file
     $ sycsvpro -f in.csv -o out.csv calc -r 2-20 -h *,target -c 6:*2,7:target=c6*10
@@ -154,11 +158,13 @@ Process arithmetic operations on the contract count and create a target column a
     chiro;c2;con331;dri100;mot130;3.05.3010;2;20
     0;0;0;0;0;0;10;100
-In the sum row non-numbers in the colums are converted to 0. Therefore column 0 is summed up to 0 as all strings are converted to 0.
+In the sum row non-numbers in the colums are converted to 0. Therefore column 0
+is summed up to 0 as all strings are converted to 0.
 Sort
 ----
-Sort rows on specified columns as an example sort rows based on customer (string s) and contract date (date d)
+Sort rows on specified columns as an example sort rows based on customer
+(string s) and contract date (date d)
     $ sycsvpro -f in.csv -o out.csv sort -r 2-20 -c s:0,d:5
@@ -169,35 +175,44 @@ Sort rows on specified columns as an example sort rows based on customer (string
     chiro;c2;con331;dri100;mot130;3.05.3010;1
     chiro;c1;con333;dri110;mot100;1.10.3011;1
-Sort expects the first non-empty row as the header row. If --headerless switch is set then sort assumes no header being available.
+Sort expects the first non-empty row as the header row. If --headerless switch
+is set then sort assumes no header being available.
 Insert
 ------
-Add rows at the bottom or on top of a file. The command below adds the content of the file file-with-rows-to-insert.text on top of the file in.csv and saves it to out.csv
+Add rows at the bottom or on top of a file. The command below adds the content
+of the file file-with-rows-to-insert.text on top of the file in.csv and saves
+it to out.csv
     $ sycsvpro -f in.csv -o out.csv insert file-with-rows-to-insert.txt -p top
 Edit
 ----
-Creates or if it exists opens a file for editing. The file is created in the directory ~/.syc/sycsvpro/scripts. Following command creates a Ruby script with the name script.rb and a method call_me
+Creates or if it exists opens a file for editing. The file is created in the
+directory ~/.syc/sycsvpro/scripts. Following command creates a Ruby script with
+the name script.rb and a method call_me
     $ sycsvpro edit -s script.rb -m call_me
 List
 ----
-List the scripts or insert-file available in the scripts directory
+List the scripts, insert-file or all scripts available in the scripts directory
+which is also displayed
+    script directory: ~/.syc/sycsvpro/scripts
     $ sycsvpro list -m
     script.rb
       call_me
 Execute
 -------
-Execute takes a Ruby script file as an argument and processes the script. The following command executes the script *script.rb* and invokes the method *calc*
+Execute takes a Ruby script file as an argument and processes the script. The
+following command executes the script *script.rb* and invokes the method *calc*
     $ sycsvpro execute ./script.rb calc
-Below is an example script file that is ultimately doing the same as the count command
+Below is an example script file that is ultimately doing the same as the count
+command
     $ sycsvpro -f in.csv -o out.csv count -r 1-20 -k 0 -c 4,5
@@ -232,15 +247,20 @@ def calc
 end
 ```
-*rows* and *write_to* are convenience methods provided by sycsvpro that can be used in script files to operate on files.
+*rows* and *write_to* are convenience methods provided by sycsvpro that can be
+used in script files to operate on files.
-*rows* will return values at the specified columns in the order they are provided in the call to
-rows. The columns to be returned in the block have to end with _column_ or _columns_ dependent if a value or an array should be returned. You can find the *rows* and *write_to* methods at _lib/sycsvpro/dsl.rb_.
+*rows* will return values at the specified columns in the order they are
+provided in the call to rows. The columns to be returned in the block have to
+end with _column_ or _columns_ dependent if a value or an array should be
+returned. You can find the *rows* and *write_to* methods at
+_lib/sycsvpro/dsl.rb_.
 Working with sycsvpro
 =====================
-sycsvpro emerged from my daily work when cleaning and anaylzing data. If you want to dig deeper I would recommend [R](http://www.r-project.org/).
+sycsvpro emerged from my daily work when cleaning and anaylzing data. If you
+want to dig deeper I would recommend [R](http://www.r-project.org/).
 A work flow could be as follows
@@ -251,7 +271,29 @@ A work flow could be as follows
 * Do arithmetic operations on the values `calc`
 * Sort the rows based on column values
-When I have analyzed the data I use _Microsoft Excel_ or _LibreOffice Calc_ to create nice graphs. To create more sophisiticated analysis *R* is the right tool to use.
+When I have analyzed the data I use _Microsoft Excel_ or _LibreOffice Calc_ to
+create nice graphs. To create more sophisiticated analysis *R* is the right tool
+to use.
+Release notes
+=============
+Version 0.1.2
+-------------
+* Now it is possible to have , in the filter as non separating values. You can
+now define filter like 1-2,4,/[56789]{2,}/,10
+* Filtering rows on boolean expression based on values contained in columns.
+  The boolean expression has to be enclosed between BEGIN and END
+  Example:
+    -r BEGINs0=='Ruby'&&n1<1||d2==Date.new(2014,6,17)END
+    s0 - string in column 0
+    n1 - number in column 1
+    d2 - date   in column 2
+* ``list`` shows the directory of the script file and has the flag *all* to
+show all scripts, that is _insert files_ and _Ruby files_
+* When counting columns with *count* the column headers are sorted
+alphabetically. No it is possible to set ``sort: false`` to keep the column
+headers in the sequence they are specified
 Installation
 ============
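
The first 0.1.2 release note (commas allowed inside filter values) corresponds to the new `scan` call in `create_filter` in `lib/sycsvpro/filter.rb`. Below is a minimal sketch of that tokenization, assuming the regular expression is copied verbatim from the diff. The helper name `split_filter_values` is hypothetical and not part of sycsvpro.

```ruby
# Hypothetical helper, not part of sycsvpro: splits a filter string into
# tokens using the regular expression that create_filter uses in 0.1.2.
# A token is a BEGIN...END expression, a /regex/, or any other text that
# sits between two commas (or between a comma and the string boundary).
def split_filter_values(values)
  return [] if values.nil?
  values.scan(/(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i).flatten
end

# The comma inside {2,} stays part of the regex token.
p split_filter_values("1-2,4,/[56789]{2,}/,10")

# The commas inside Date.new(...) stay part of the BEGIN...END token.
p split_filter_values("1,BEGINd2==Date.new(2014,6,17)END")
```

In 0.1.1 the same input was split with `values.split(',')`, which would have cut `/[56789]{2,}/` at the comma inside the quantifier.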

data/bin/sycsvpro CHANGED

@@ -11,6 +11,12 @@ end
 include GLI::App
+row_regex = %r{
+  \d+(?:,\d+|-\d+|-eof|,\/.*\/)*|
+  \/.*\/(?:,\/.*\/|\d+)*|
+  BEGIN.*?END
+}xi
 # Directory holding configuration files
 sycsvpro_directory = File.expand_path("~/.syc/sycsvpro")
@@ -58,17 +64,24 @@ end
 desc 'Extract specified rows and columns from the file'
 command :extract do |c|
   c.desc 'Rows to extract'
-  c.arg_name '1,2,10-30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.arg_name '1,2,10-30,45-EOF,REGEXP,BEGINlogical_expressionEND'
+  c.flag [:r, :row], :must_match => row_regex
   c.desc 'Columns to extract'
   c.arg_name '1,2,10-30'
   c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
+  c.desc 'Format of date values'
+  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
+  c.flag [:df]
   c.action do |global_options,options,args|
     print "Extracting ..."
-    extractor = Sycsvpro::Extractor.new(infile: global_options[:f], outfile: global_options[:o],
-                                        rows: options[:r], cols: options[:c])
+    extractor = Sycsvpro::Extractor.new(infile:  global_options[:f],
+                                        outfile: global_options[:o],
+                                        rows:    options[:r],
+                                        cols:    options[:c],
+                                        df:      options[:df])
     extractor.execute
     puts "done"
   end
@@ -79,16 +92,23 @@ command :collect do |c|
   c.desc 'Rows to consider for collection'
   c.arg_name 'ROW1,ROW2,ROW10-ROW30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Columns to collect values from'
   c.arg_name 'CATEGORY1:COL1,COL2,COL10-COL30+CATEGORY2:COL3-COL9'
   c.flag [:c, :col], :must_match => /^\w*:\d+(?:,\d+|-\d+|\+\w*:\d+(?:,\d+|-\d+)*)*/
+  c.desc 'Format of date values'
+  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
+  c.flag [:df]
   c.action do |global_options,options,args|
     print "Collecting ..."
-    collector = Sycsvpro::Collector.new(infile: global_options[:f], outfile: global_options[:o],
-                                        rows: options[:r], cols: options[:c])
+    collector = Sycsvpro::Collector.new(infile:  global_options[:f],
+                                        outfile: global_options[:o],
+                                        rows:    options[:r],
+                                        cols:    options[:c],
+                                        df:      options[:df])
     collector.execute
     puts "done"
   end
@@ -98,7 +118,7 @@ desc 'Allocate specified columns from the file to a key value'
 command :allocate do |c|
   c.desc 'Rows to consider'
   c.arg_name '1,2,10-30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Key to allocate columns to'
   c.arg_name '0'
@@ -108,10 +128,18 @@ command :allocate do |c|
   c.arg_name '1,2,10-30'
   c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
+  c.desc 'Format of date values'
+  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
+  c.flag [:df]
   c.action do |global_options,options,args|
     print "Allocating ..."
-    allocator = Sycsvpro::Allocator.new(infile: global_options[:f], outfile: global_options[:o],
-                                        key: options[:k], rows: options[:r], cols: options[:c])
+    allocator = Sycsvpro::Allocator.new(infile:  global_options[:f],
+                                        outfile: global_options[:o],
+                                        key:     options[:k],
+                                        rows:    options[:r],
+                                        cols:    options[:c],
+                                        df:      options[:df])
     allocator.execute
     puts "done"
   end
@@ -136,10 +164,10 @@ end
 desc 'Lists script or insert files in the scripts directory with optionally listing methods of script files'
 command :list do |c|
-  c.desc 'Type of script (Ruby or insert file)'
+  c.desc 'Type of script (Ruby, insert or all files)'
   c.default_value 'script'
-  c.arg_name 'SCRIPT|INSERT'
-  c.flag [:t, :type], :must_match => /script|insert/i
+  c.arg_name 'SCRIPT|INSERT|ALL'
+  c.flag [:t, :type], :must_match => /script|insert|all/i
   c.desc 'Name of the script file'
   c.arg_name 'SCRIPT_NAME.rb|INSERT_NAME.ins'
@@ -148,12 +176,19 @@ command :list do |c|
   c.desc 'Show methods'
   c.switch [:m, :method]
+  c.desc 'Show script directory'
+  c.switch [:d, :dir]
   c.action do |global_options,options,args|
-    script_list = Sycsvpro::ScriptList.new(dir: script_directory, type: options[:t],
-                                           script: options[:s], show_methods: options[:m])
+    script_list = Sycsvpro::ScriptList.new(dir: script_directory,
+                                           type: options[:t],
+                                           script: options[:s],
+                                           show_methods: options[:m])
     scripts = script_list.execute
+    puts "script directory: #{script_directory}" if options[:d]; puts
     if scripts.empty?
       help_now! "No scripts available. You can create scripts with the edit command"
     else
@@ -194,11 +229,11 @@ command :count do |c|
   c.desc 'Key columns that are assigned the count of column values'
   c.arg_name 'COLUMN:TITLE,COLUMN:TITLE'
-  c.flag [:k, :key], :must_match => /^\d+:\w+(?:,\d+:\w+)*/
+  c.flag [:k, :key], :required => true, :must_match => /^\d+:\w+(?:,\d+:\w+)*/
   c.desc 'Rows to consider'
   c.arg_name '1,2,10-30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Columns to count where columns 2 and 3 are counted conditionally'
   c.arg_name '1,2:<14.2.2014,10-30,3:>10'
@@ -212,11 +247,19 @@ command :count do |c|
   c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
   c.flag [:df]
+  c.desc 'Sort headline values'
+  c.switch [:sort], :default_value => true
   c.action do |global_options,options,args|
     print "Counting..."
-    counter = Sycsvpro::Counter.new(infile: global_options[:f], outfile: global_options[:o],
-                                    key: options[:k], rows: options[:r], cols: options[:c],
-                                    df: options[:df], sum: options[:s])
+    counter = Sycsvpro::Counter.new(infile: global_options[:f],
+                                    outfile: global_options[:o],
+                                    key: options[:k],
+                                    rows: options[:r],
+                                    cols: options[:c],
+                                    df: options[:df],
+                                    sum: options[:s],
+                                    sort: options[:sort])
     counter.execute
     puts "done"
   end
@@ -229,7 +272,7 @@ command :aggregate do |c|
   c.desc 'Rows to consider'
   c.arg_name '1,2,10-30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Columns to count'
   c.arg_name '1,2-4'
@@ -240,10 +283,18 @@ command :aggregate do |c|
   c.arg_name 'SUM_ROW_TITLE:ROW,SUM_COL_TITLE'
   c.flag [:s, :sum], :must_match => /^\w+:\d+(?:,\w+)?|^\w+/
+  c.desc 'Format of date values'
+  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
+  c.flag [:df]
   c.action do |global_options,options,args|
     print "Aggregating..."
-    aggregator = Sycsvpro::Aggregator.new(infile: global_options[:f], outfile: global_options[:o],
-                                          rows: options[:r], cols: options[:c], sum: options[:s])
+    aggregator = Sycsvpro::Aggregator.new(infile:  global_options[:f],
+                                          outfile: global_options[:o],
+                                          rows:    options[:r],
+                                          cols:    options[:c],
+                                          sum:     options[:s],
+                                          df:      options[:df])
     aggregator.execute
     puts "done"
   end
@@ -254,7 +305,7 @@ desc 'Sort rows based on column values'
 command :sort do |c|
   c.desc 'Rows to consider'
   c.arg_name '1,2,10-30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Columns to sort based on a type (n = number, s = string, d = date) and its value'
   c.arg_name 'n:1,s:2-5,d:7'
@@ -310,18 +361,27 @@ arg_name 'MAPPINGS-FILE'
 command :map do |c|
   c.desc 'Rows to consider'
   c.arg_name 'ROW1,ROW2,ROW10-ROW30,45-EOF,REGEXP'
-  c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+  c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
   c.desc 'Columns to consider for mapping'
   c.arg_name 'COL1,COL2,COL10-COL30'
   c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
+  c.desc 'Format of date values'
+  c.arg_name '%d.%m.%Y|%m/%d/%Y|...'
+  c.default_value '%Y-%m-%d'
+  c.flag [:df]
   c.action do |global_options,options,args|
     help_now! "You need to provide a mapping file" if args.size == 0
     print "Mapping..."
-    mapper = Sycsvpro::Mapper.new(infile: global_options[:f], outfile: global_options[:o],
-                                  mapping: args[0], rows: options[:r], cols: options[:c])
+    mapper = Sycsvpro::Mapper.new(infile:  global_options[:f],
+                                  outfile: global_options[:o],
+                                  mapping: args[0],
+                                  rows:    options[:r],
+                                  cols:    options[:c],
+                                  df:      options[:df])
     mapper.execute
     puts "done"
   end
@@ -336,9 +396,9 @@ command :calc do |c|
     default_value '*'
     c.flag [:h, :header], :must_match => /\*(?:,\w+)*/
-    c.desc 'Columns to consider for calculations'
+    c.desc 'Rows to consider for calculations'
     c.arg_name 'ROW1,ROW2-ROW10,45-EOF,REGEXP'
-    c.flag [:r, :row], :must_match => /\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
+    c.flag [:r, :row], :must_match => row_regex #/\d+(?:,\d+|-\d+|-eof|,\/.*\/)*|\/.*\/(?:,\/.*\/|\d+)*/i
     c.desc 'Column to do calculations on'
     c.arg_name 'COL1:*2,COL2:-C3,COL3:*2+(4+C5),COL6:NEW_COL=C1+5'
@@ -356,9 +416,13 @@ command :calc do |c|
     help_now! "You need to provide the column flag" if options[:c].nil?
     print "Calculating..."
-    calculator = Sycsvpro::Calculator.new(infile: global_options[:f], outfile: global_options[:o],
-                                          header: options[:h], rows: options[:r], cols: options[:c],
-                                          sum: options[:s], df: options[:df])
+    calculator = Sycsvpro::Calculator.new(infile:  global_options[:f],
+                                          outfile: global_options[:o],
+                                          header:  options[:h],
+                                          rows:    options[:r],
+                                          cols:    options[:c],
+                                          sum:     options[:s],
+                                          df:      options[:df])
     calculator.execute
     puts "done"
   end
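
The shared `row_regex` introduced at the top of `bin/sycsvpro` replaces the pattern that was previously repeated on every `-r` flag and adds a third alternative for `BEGIN...END` expressions. The sketch below shows what the pattern matches when it is applied with plain `Regexp#match`; how GLI itself applies `:must_match` is not shown in the diff and is not modelled here. The sample values are illustrative assumptions.

```ruby
# Copied from the diff of bin/sycsvpro. The x flag ignores the whitespace
# and line breaks inside the literal; the i flag makes EOF, BEGIN and END
# case-insensitive.
row_regex = %r{
  \d+(?:,\d+|-\d+|-eof|,\/.*\/)*|
  \/.*\/(?:,\/.*\/|\d+)*|
  BEGIN.*?END
}xi

# Illustrative sample values (assumptions, not taken from the gem's specs).
samples = ["1,2,10-30,45-EOF", "/^chiro/", "BEGINs0=='Ruby'&&n1<1END", "abc"]

samples.each do |value|
  match = row_regex.match(value)
  puts "#{value.inspect} -> #{match ? match[0].inspect : 'no match'}"
end
```

The pattern is not anchored with `^` or `$`, so a plain `match` succeeds as soon as any substring fits one of the three alternatives.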

data/lib/sycsvpro/aggregator.rb CHANGED

@@ -41,7 +41,7 @@ module Sycsvpro
       @infile     = options[:infile]
       @outfile    = options[:outfile]
       @headerless = options[:headerless] || false
-      @row_filter = RowFilter.new(options[:rows])
+      @row_filter = RowFilter.new(options[:rows], df: options[:df])
       @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
       @key_values = Hash.new(0)
       @heading    = []

data/lib/sycsvpro/allocator.rb CHANGED

@@ -19,8 +19,8 @@ module Sycsvpro
     def initialize(options={})
       @infile     = options[:infile]
       @outfile    = options[:outfile]
-      @key_filter = ColumnFilter.new(options[:key])
-      @row_filter = RowFilter.new(options[:rows])
+      @key_filter = ColumnFilter.new(options[:key], df: options[:df])
+      @row_filter = RowFilter.new(options[:rows], df: options[:df])
       @col_filter = ColumnFilter.new(options[:cols])
     end

data/lib/sycsvpro/calculator.rb CHANGED

@@ -37,7 +37,7 @@ module Sycsvpro
       @infile      = options[:infile]
       @outfile     = options[:outfile]
       @date_format = options[:df] || "%Y-%m-%d"
-      @row_filter  = RowFilter.new(options[:rows])
+      @row_filter  = RowFilter.new(options[:rows], df: options[:df])
       @header      = Header.new(options[:header])
       @sum_row     = []
       @add_sum_row = options[:sum] || false

data/lib/sycsvpro/collector.rb CHANGED

@@ -20,7 +20,7 @@ module Sycsvpro
     def initialize(options={})
       @infile = options[:infile]
       @outfile = options[:outfile]
-      @row_filter = RowFilter.new(options[:rows])
+      @row_filter = RowFilter.new(options[:rows], df: options[:df])
       @collection = {}
       init_collection(options[:cols])
     end

data/lib/sycsvpro/counter.rb CHANGED

@@ -27,6 +27,8 @@ module Sycsvpro
     attr_reader :key_values
     # header of the out file
     attr_reader :heading
+    # indicates whether the headline values should be sorted
+    attr_reader :heading_sort
     # Title of the sum row
     attr_reader :sum_row_title
     # row where to add the sums of the columns
@@ -39,15 +41,16 @@ module Sycsvpro
     # Creates a new counter. Takes as attributes infile, outfile, key, rows, cols, date-format and
     # indicator whether to add a sum row
     def initialize(options={})
-      @infile     = options[:infile]
-      @outfile    = options[:outfile]
+      @infile       = options[:infile]
+      @outfile      = options[:outfile]
       init_key_columns(options[:key])
-      @row_filter = RowFilter.new(options[:rows])
-      @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
-      @key_values = {}
-      @heading    = []
+      @row_filter   = RowFilter.new(options[:rows], df: options[:df])
+      @col_filter   = ColumnFilter.new(options[:cols], df: options[:df])
+      @key_values   = {}
+      @heading      = []
+      @heading_sort = options[:sort].nil? ? true : options[:sort]
       init_sum_scheme(options[:sum])
-      @sums       = Hash.new(0)
+      @sums         = Hash.new(0)
     end
     # Executes the counter
@@ -82,17 +85,18 @@ module Sycsvpro
    # Writes the count results
     def write_result
       sum_line = [sum_row_title] + [''] * (key_titles.size - 1)
-      heading.sort.each do |h|
+      headline = heading_sort ? heading.sort : col_filter.pivot.keys
+      headline.each do |h|
         sum_line << sums[h]
       end
       row = 0;
       File.open(outfile, 'w') do |out|
         out.puts sum_line.join(';') if row == sum_row ; row += 1
-        out.puts (key_titles + heading.sort).join(';')
+        out.puts (key_titles + headline).join(';')
         key_values.each do |k,v|
           out.puts sum_line.join(';') if row == sum_row ; row += 1
           line = [k]
-          heading.sort.each do |h|
+          headline.each do |h|
             line << v[:elements][h] unless h == sum_col_title
           end
           line << v[:sum] unless sum_col_title.nil?
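
The `counter.rb` change makes the order of the column headers configurable: sorted alphabetically by default, or in the order the columns were specified when `sort: false` is passed. Here is a minimal sketch of that selection, assuming `col_filter.pivot` is a Hash whose keys were inserted in the order given on the command line. Both helper methods are hypothetical stand-ins, not sycsvpro methods.

```ruby
# Hypothetical stand-in for the option handling in Counter#initialize:
# a missing :sort option defaults to true, an explicit false is kept.
def heading_sort_from(options)
  options[:sort].nil? ? true : options[:sort]
end

# Hypothetical stand-in for the headline selection in Counter#write_result.
def headline(heading, pivot, heading_sort)
  heading_sort ? heading.sort : pivot.keys
end

# Ruby hashes keep insertion order, so pivot.keys reflects the order in
# which the columns were specified.
pivot = {
  "<1.1.2013"           => {},
  "1.1.2013-31.12.2014" => {},
  ">31.12.2014"         => {}
}
heading = pivot.keys

p headline(heading, pivot, heading_sort_from({}))
p headline(heading, pivot, heading_sort_from(sort: false))
```

With the default, `"1.1.2013-31.12.2014"` sorts before `"<1.1.2013"` because the character `1` precedes `<`, which is the header order the 0.1.1 version of `counter_spec.rb` expected.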

data/lib/sycsvpro/extractor.rb CHANGED

@@ -20,8 +20,8 @@ module Sycsvpro
     def initialize(options={})
       @in_file  = options[:infile]
       @out_file = options[:outfile]
-      @row_filter = RowFilter.new(options[:rows])
-      @col_filter = ColumnFilter.new(options[:cols])
+      @row_filter = RowFilter.new(options[:rows], df: options[:df])
+      @col_filter = ColumnFilter.new(options[:cols], df: options[:df])
     end
     # Executes the extractor

data/lib/sycsvpro/filter.rb CHANGED

@@ -11,6 +11,8 @@ module Sycsvpro
     attr_reader :date_format
     # Filter for rows and columns
     attr_reader :filter
+    # Boolean for rows
+    attr_reader :boolean_filter
     # Type of column (n = number, s = string)
     attr_reader :types
     # Pattern that is used as a filter
@@ -25,21 +27,32 @@ module Sycsvpro
       @types   = []
       @pattern = []
       @pivot   = {}
+      @boolean_filter = ""
       create_filter(values)
     end
     # Creates the filters based on the given patterns
     def method_missing(id, *args, &block)
+      boolean_row_regex = %r{
+        BEGIN(\(*[nsd]\d+[<!=~>]{1,2}
+         (?:[A-Z][A-Za-z]*\.new\(.*?\)|\d+|['"].*?['"])
+         (?:\)*(?:&&|\|\||$)
+         \(*[nsd]\d+[<!=~>]{1,2}
+         (?:[A-Z][A-Za-z]*\.new\(.*?\)|\d+|['"].*?['"])\)*)*)END
+      }xi
+      return boolean_row($1, args, block)          if id =~ boolean_row_regex
       return equal($1, args, block)                if id =~ /^(\d+)$/
       return equal_type($1, $2, args, block)       if id =~ /^(s|n|d):(\d+)$/
       return range($1, $2, args, block)            if id =~ /^(\d+)-(\d+)$/
       return range_type($1, $2, $3, args, block)   if id =~ /^(s|n|d):(\d+)-(\d+)$/
       return regex($1, args, block)                if id =~ /^\/(.*)\/$/
       return col_regex($1, $2, args, block)        if id =~ /^(\d+):\/(.*)\/$/
-      return date($1, $2, $3, args, block)         if id =~ /^(\d+):(<|=|>)(\d+.\d+.\d+)/
+      return date($1, $2, $3, args, block)         if id =~ /^(\d+):(<|=|>)(\d+.\d+.\d+)$/
       return date_range($1, $2, $3, args, block)   if id =~ /^(\d+):(\d+.\d+.\d+.)-(\d+.\d+.\d+)$/
-      return number($1, $2, $3, args, block)       if id =~ /^(\d+):(<|=|>)(\d+)/
-      return number_range($1, $2, $3, args, block) if id =~ /^(\d):(\d+)-(\d+)/
+      return number($1, $2, $3, args, block)       if id =~ /^(\d+):(<|=|>)(\d+)$/
+      return number_range($1, $2, $3, args, block) if id =~ /^(\d):(\d+)-(\d+)$/
       super
     end
@@ -48,6 +61,44 @@ module Sycsvpro
       raise 'Needs to be overridden by sub class'
     end
+    # Checks whether the values match the boolean filter
+    def match_boolean_filter?(values=[])
+      return false if boolean_filter.empty? or values.empty?
+      expression = boolean_filter
+      columns = expression.scan(/(([nsd])(\d+))([<!=~>]{1,2})(.*?)(?:[\|&]{2}|$)/)
+#      STDERR.puts "expr = #{expression.inspect}"
+#      STDERR.puts "vals = #{values.inspect}"
+#      STDERR.puts "cols = #{columns.inspect}"
+      columns.each do |c|
+#        STDERR.puts "val = #{values[c[2].to_i].inspect}"
+        value = case c[1]
+        when 'n'
+          values[c[2].to_i].empty? ? '0' : values[c[2].to_i]
+        when 's'
+          "'#{values[c[2].to_i]}'"
+        when 'd'
+          begin
+            Date.strptime(values[c[2].to_i], date_format)
+          rescue Exception => e
+            case c[3]
+            when '<', '<=', '=='
+              "#{c[4]}+1"
+            when '>', '>='
+              '0'
+            when '!='
+              c[4]
+            end
+          else
+            "Date.strptime('#{values[c[2].to_i]}', '#{date_format}')"
+          end
+        end
+        expression = expression.gsub(c[0], value)
+#        STDERR.puts "val2 = #{value}"
+      end
+#      STDERR.puts "exp = #{expression.inspect}"
+      eval(expression)
+    end
     # Yields the column value and whether the filter matches the column
     def pivot_each_column(values=[])
       pivot.each do |column, parameters|
@@ -65,14 +116,17 @@ module Sycsvpro
     # Checks whether a filter has been set. Returns true if filter has been set otherwise false
     def has_filter?
-      return !(filter.empty? and pattern.empty?)
+      return !(filter.empty? and pattern.empty? and boolean_filter.empty?)
     end
     private
-      # Creates a filter based on the provided rows and columns
+      # Creates a filter based on the provided rows and columns select criteria
       def create_filter(values)
-        values.split(',').each { |f| send(f) } unless values.nil?
+        values.scan(/(?<=,|^)(BEGIN.*?END|\/.*?\/|.*?)(?=,|$)/i).flatten.each do |value|
+#          STDERR.puts "value = #{value}"
+          send(value)
+        end unless values.nil?
       end
       # Adds a single value to the filter
@@ -110,6 +164,11 @@ module Sycsvpro
         pivot[r] = { col: col, operation: operation }
       end
+      # Adds a boolean row filter
+      def boolean_row(operation, args, block)
+        boolean_filter.clear << operation
+      end
       # Adds a date filter
       def date(col, comparator, date, args, block)
         comparator = '==' if comparator == '='
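
The new `match_boolean_filter?` works by substituting the column placeholders (`s0`, `n1`, ...) in the boolean expression with the values of the current row and then evaluating the resulting string. The sketch below is a simplified illustration of that idea, restricted to number and string columns. The method name `evaluate_row` is hypothetical, and the date handling of the original, including its fallback values for unparsable dates, is left out.

```ruby
# Hypothetical, simplified stand-in for Filter#match_boolean_filter?.
# Covers only n (number) and s (string) columns; d (date) is omitted.
def evaluate_row(boolean_filter, values)
  return false if boolean_filter.empty? || values.empty?

  expression = boolean_filter
  # Same scan as in the diff; each hit is
  # [placeholder, type, column index, operator, operand].
  columns = expression.scan(/(([nsd])(\d+))([<!=~>]{1,2})(.*?)(?:[\|&]{2}|$)/)

  columns.each do |placeholder, type, index, _operator, _operand|
    cell = values[index.to_i].to_s
    replacement = case type
                  when 'n' then cell.empty? ? '0' : cell  # empty cells count as 0
                  when 's' then "'#{cell}'"               # strings are quoted
                  else raise ArgumentError, "type #{type} is not covered by this sketch"
                  end
    expression = expression.gsub(placeholder, replacement)
  end

  # The substituted text is executed as Ruby code, so both the expression
  # and the cell values need to come from a trusted source.
  eval(expression)
end

row = "Ruby;0".split(';')
p evaluate_row("s0=='Ruby'&&n1<1", row)
p evaluate_row("s0=='Perl'||n1>5", row)
```

For the first call the expression becomes `'Ruby'=='Ruby'&&0<1` before it is evaluated; for the second it becomes `'Ruby'=='Perl'||0>5`.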

data/lib/sycsvpro/mapper.rb CHANGED

@@ -19,8 +19,8 @@ module Sycsvpro
     def initialize(options={})
       @infile = options[:infile]
       @outfile = options[:outfile]
-      @row_filter = RowFilter.new(options[:row_filter])
-      @col_filter = ColumnFilter.new(options[:col_filter])
+      @row_filter = RowFilter.new(options[:row_filter], df: options[:df])
+      @col_filter = ColumnFilter.new(options[:col_filter], df: options[:df])
       @mapper = {}
       init_mapper(options[:mapping])
     end

data/lib/sycsvpro/row_filter.rb CHANGED

@@ -17,9 +17,10 @@ module Sycsvpro
       pattern.each do |p|
         filtered = (filtered or !(object =~ Regexp.new(p)).nil?)
       end
+      filtered = (filtered or match_boolean_filter?(object.split(';')))
       filtered ? object : nil
     end
   end
 end

data/lib/sycsvpro/script_list.rb CHANGED

@@ -21,6 +21,7 @@ module Sycsvpro
       @script_type.downcase!
       @script_file  = options[:script] || '*.rb'  if @script_type == 'script'
       @script_file  = options[:script] || '*.ins' if @script_type == 'insert'
+      @script_file  = options[:script] || '*.{rb,ins}' if @script_type == 'all'
       @show_methods = options[:show_methods] if @script_type == 'script'
       @show_methods = false if @script_type == 'insert'
       @list         = {}

data/lib/sycsvpro/sorter.rb CHANGED

@@ -32,7 +32,7 @@ module Sycsvpro
       @outfile         = options[:outfile]
       @headerless      = options[:headerless] || false
       @desc            = options[:desc] || false
-      @row_filter      = RowFilter.new(options[:rows])
+      @row_filter      = RowFilter.new(options[:rows], df: options[:df])
       @col_type_filter = ColumnTypeFilter.new(options[:cols], df: options[:df])
       @sorted_rows     = []
     end

data/lib/sycsvpro/version.rb CHANGED

@@ -1,5 +1,5 @@
 # Operating csv files
 module Sycsvpro
   # Version number of sycsvpro
-  VERSION = '0.1.1'
+  VERSION = '0.1.2'
 end

data/spec/sycsvpro/counter_spec.rb CHANGED

@@ -33,15 +33,15 @@ module Sycsvpro
     it "should count date columns" do
       counter = Counter.new(infile: @in_file, outfile: @out_file, rows: "1-10",
                             cols: "2:<1.1.2013,2:1.1.2013-31.12.2014,2:>31.12.2014",
-                            key: "0:customer", df: "%d.%m.%Y")
+                            key: "0:customer", df: "%d.%m.%Y", sort: false)
       counter.execute
-      result = [ "customer;1.1.2013-31.12.2014;<1.1.2013;>31.12.2014",
+      result = [ "customer;<1.1.2013;1.1.2013-31.12.2014;>31.12.2014",
                  "Fink;0;0;2",
-                 "Haas;0;1;0",
-                 "Gent;1;0;0",
-                 "Rank;1;0;0" ]
+                 "Haas;1;0;0",
+                 "Gent;0;1;0",
+                 "Rank;0;1;0" ]
       File.open(@out_file).each_with_index do |line, index|
         line.chomp.should eq result[index]

data/spec/sycsvpro/extractor_spec.rb CHANGED

@@ -5,7 +5,8 @@ module Sycsvpro
   describe Extractor do
     before do
-      @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
+      @in_file  = File.join(File.dirname(__FILE__), "files/in.csv")
+      @in_file2 = File.join(File.dirname(__FILE__), "files/in4.csv")
       @out_file = File.join(File.dirname(__FILE__), "files/out.csv")
     end
@@ -37,6 +38,20 @@ module Sycsvpro
     end
+    it "should extract rows base on regex including commas" do
+      extractor = Extractor.new(infile: @in_file2, outfile: @out_file, rows: "/[56789]\\d+|\\d{3,}/")
+      extractor.execute
+      result = [ "Gent;50",
+                 "Haas;100",
+                 "Klig;80" ]
+      File.open(@out_file).each_with_index do |line, index|
+        line.chomp.should eq result[index]
+      end
+    end
   end
 end

data/spec/sycsvpro/row_filter_spec.rb ADDED

@@ -0,0 +1,76 @@
+require 'sycsvpro/row_filter'
+module Sycsvpro
+  describe RowFilter do
+    before do
+      @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
+      @out_file = File.join(File.dirname(__FILE__), "files/out.csv")
+    end
+    it "should return row string when no filter is set" do
+      row_filter = Sycsvpro::RowFilter.new(nil)
+      row_filter.process("abc", row: 1).should eq "abc"
+    end
+    it "should filter rows on index" do
+      rows = "1-5"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("abc", row: 1).should eq "abc"
+      row_filter.process("abc", row: 6).should be_nil
+    end
+    it "should filter rows on regex" do
+      rows = "1,\/\\d{2,}\/"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("5;50;500", row: 1).should eq "5;50;500"
+      row_filter.process("5;50;500", row: 2).should eq "5;50;500"
+    end
+    it "should filter rows on logical expression" do
+      rows = "BEGINn1>50&&s2=='Ruby'||n3<10END"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("a;49;Rub;9").should eq "a;49;Rub;9"
+      row_filter.process("a;51;Ruby;11").should eq "a;51;Ruby;11"
+      row_filter.process("a;49;Ruby;11").should be_nil
+    end
+    it "should filter rows on Ruby classes" do
+      rows = "BEGINn1==50&&d2==Date.new(2014,6,16)||s3=~Regexp.new('[56789]\\d{2,}')END"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("x;50;2014-06-16;99").should eq "x;50;2014-06-16;99"
+    end
+    it "should filter rows on row number filter and boolean filter" do
+      rows = "1,3-4,BEGINn1==50&&d2<Date.new(2014,6,16)||s3=='Works?'END"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("x;50;2014-06-15;Works?").should eq "x;50;2014-06-15;Works?"
+      row_filter.process("x;50;2014-06-15;Works?", row: 1).should eq "x;50;2014-06-15;Works?"
+    end
+    it "should filter rows on boolean filter with brackets" do
+      rows = "BEGINn1==50&&(d2<Date.new(2014,6,16)||s3=='Works?')END"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("x;50;2014-6-15;Works?").should eq "x;50;2014-6-15;Works?"
+      row_filter.process("x;49;2014-6-15;Works?").should be_nil
+      row_filter.process("x;50;2014-6-17;Worx?").should be_nil
+    end
+    it "should fitler rows with ' in value" do
+      rows = "BEGINn1!=50||n2=~'/\\d+/'||n2==\"Doesn't work\"END"
+      row_filter = Sycsvpro::RowFilter.new(rows)
+      row_filter.process("x;50;2;we").should be_nil
+      row_filter.process("x;49;/\\d+/;\"Doesn't work\"").should eq "x;49;/\\d+/;Doesn't work"
+    end
+    it "should not filter rows with invalid syntax" do
+      rows = "BEGINn1!=50||n2=~regex('\\d+')END"
+      expect { Sycsvpro::RowFilter.new(rows) }.to raise_error
+    end
+  end
+end
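
Note on the spec above: it exercises boolean row filters of the form `BEGIN...END`, in which `n1` appears to mean "column 1 as a number" and `s2` "column 2 as a string". The sketch below shows one way such an expression could be evaluated. It is not the sycsvpro implementation: `match_row?` is a hypothetical name, only the `n` and `s` prefixes are handled (no `d` date columns), and column references inside string literals are not protected.

```ruby
# Hypothetical sketch (not sycsvpro's code): evaluate a BEGIN...END filter
# against one semicolon-separated row.
def match_row?(filter, row, sep: ';')
  # Strip the BEGIN/END markers that delimit the boolean expression.
  body = filter[/\ABEGIN(.*)END\z/m, 1]
  raise ArgumentError, "not a boolean filter: #{filter}" if body.nil?

  values = row.split(sep)

  # Replace each column reference with a Ruby literal: n<i> becomes a float,
  # s<i> becomes a quoted string.
  code = body.gsub(/\b([ns])(\d+)\b/) do
    type  = Regexp.last_match(1)
    value = values[Regexp.last_match(2).to_i]
    type == 'n' ? value.to_f.to_s : value.to_s.inspect
  end

  # eval runs arbitrary Ruby, so this is only acceptable for trusted filters.
  eval(code) ? true : false
end

filter = "BEGINn1>50&&s2=='Ruby'||n3<10END"
p match_row?(filter, "a;49;Rub;9")    # => true  (n3 < 10)
p match_row?(filter, "a;51;Ruby;11")  # => true  (n1 > 50 and s2 == 'Ruby')
p match_row?(filter, "a;49;Ruby;11")  # => false
```

The three calls mirror the "should filter rows on logical expression" example, where the first two rows are returned and the third is rejected.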

metadata CHANGED

@@ -1,74 +1,84 @@
 --- !ruby/object:Gem::Specification
 name: sycsvpro
 version: !ruby/object:Gem::Version
-  version: 0.1.1
+  version: 0.1.2
+  prerelease:
 platform: ruby
 authors:
 - Pierre Sugar
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2014-03-12 00:00:00.000000000 Z
+date: 2014-06-17 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: rake
   requirement: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: rdoc
   requirement: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: aruba
   requirement: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: rspec
   requirement: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    none: false
     requirements:
-    - - ">="
+    - - ! '>='
       - !ruby/object:Gem::Version
         version: '0'
 - !ruby/object:Gem::Dependency
   name: gli
   requirement: !ruby/object:Gem::Requirement
+    none: false
     requirements:
     - - '='
       - !ruby/object:Gem::Version
@@ -76,6 +86,7 @@ dependencies:
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    none: false
     requirements:
     - - '='
       - !ruby/object:Gem::Version
@@ -89,8 +100,8 @@ extra_rdoc_files:
 - README.rdoc
 - sycsvpro.rdoc
 files:
-- ".gitignore"
-- ".rspec"
+- .gitignore
+- .rspec
 - Gemfile
 - Gemfile.lock
 - LICENSE
@@ -201,37 +212,39 @@ files:
 - spec/sycsvpro/inserter_spec.rb
 - spec/sycsvpro/mapper_spec.rb
 - spec/sycsvpro/profiler_spec.rb
+- spec/sycsvpro/row_filter_spec.rb
 - spec/sycsvpro/script_list_spec.rb
 - spec/sycsvpro/sorter_spec.rb
 - sycsvpro.gemspec
 - sycsvpro.rdoc
 homepage: https://github.com/sugaryourcoffee/syc-svpro
 licenses: []
-metadata: {}
 post_install_message:
 rdoc_options:
-- "--title"
+- --title
 - sycsvpro
-- "--main"
+- --main
 - README.rdoc
-- "-ri"
+- -ri
 require_paths:
 - lib
 - lib
 required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
   requirements:
-  - - ">="
+  - - ! '>='
     - !ruby/object:Gem::Version
       version: '0'
 required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
   requirements:
-  - - ">="
+  - - ! '>='
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.2.0
+rubygems_version: 1.8.23
 signing_key:
-specification_version: 4
+specification_version: 3
 summary: Processing of csv files
 test_files: []

checksums.yaml DELETED

@@ -1,7 +0,0 @@
----
-SHA1:
-  metadata.gz: 7823aeea07dda43deb692fdf13a8f57559bbb917
-  data.tar.gz: 4e3da29ada80c6c4a5eb282facce537e4750f0e3
-SHA512:
-  metadata.gz: 203400b5c9187269c3fe336f940f25427eecfd5fe25a56fbc3fc982f7c47d059a6aefc5eb74b5ab5769d5a0b318ebf2ee72a231db64127cb24ca767ee8f5915f
-  data.tar.gz: deb1cfff428e6e83f1770a2f67a7b4e6fcf5cc642ce6c96cd50ba748ecfb8e44aaa39b53b88bc08083e434857d2ab778a7d086f1e61d9de3b3d7c1568a9ce343