sycsvpro 0.1.12 → 0.1.13

data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- sycsvpro (0.1.12)
4
+ sycsvpro (0.1.13)
5
5
  gli (= 2.9.0)
6
6
  timeleap (~> 0.0.1)
7
7
 
data/README.md CHANGED
@@ -7,6 +7,7 @@ Processing of csv files. *sycsvpro* offers following functions
7
7
  * extract rows and columns from a file
8
8
  * remove duplicate lines from a file where duplicates are identified by key
9
9
  columns (since version 0.1.11)
10
+ add unique to command line interface (since version 0.1.12)
10
11
  * collect values of rows and assign them to categories
11
12
  * map column values to new values
12
13
  * allocate column values to a key column (since version 0.0.4)
@@ -22,6 +23,7 @@ Processing of csv files. *sycsvpro* offers following functions
22
23
  version 0.1.4)
23
24
  * join two files based on a joint column value (since version 0.1.7)
24
25
  * merge files based on common headline columns (since version 0.1.10)
26
+ * transpose (swap) rows and columns (since version 0.1.13)
25
27
 
26
28
  To get help type
27
29
 
@@ -108,7 +110,7 @@ Collect all product rows (2, 3 and 4) to the category product
108
110
 
109
111
  Map
110
112
  ---
111
- Map the product names to new names
113
+ Map the product names to new names. Consider columns 2-4 only for mapping
112
114
 
113
115
  The mapping file (mapping) uses the result from the collect command above
114
116
 
@@ -127,6 +129,35 @@ The mapping file (mapping) uses the result from the collect command above
127
129
 
128
130
  $ sycsvpro -f in.csv -o out.csv map mapping -c 2-4
129
131
 
132
+ Transpose
133
+ ---------
134
+ Swap rows and columns of revenue.csv to out.csv
135
+
136
+ $ sycsvpro -f revenue.csv -o out.csv transpose
137
+
138
+ 2010;50;100;2000
139
+ 2011;100;50;250
140
+ 2012;150;10;300
141
+ 2013;100;1000;3000
142
+ 2014;200;20;20
143
+ customer;hello;indix;chiro
144
+
145
+ To use only columns 2013 and 2014 you can specify the columns to transpose
146
+
147
+ $ sycsvpro -f revenue.csv -o out.csv transpose -c 3-5
148
+
149
+ 2013;100;1000;3000
150
+ 2014;200;20;20
151
+ customer;hello;indix;chiro
152
+
153
+ To filter for hello only
154
+
155
+ $ sycsvpro -f revenue.csv -o out.csv transpose -c 3-5 -r 0,1
156
+
157
+ 2013;100
158
+ 2014;200
159
+ customer;hello
160
+
130
161
  Allocate
131
162
  --------
132
163
  Allocate all the machine types to the customer
@@ -196,7 +227,7 @@ Process arithmetic operations on the contract count and create a target column
196
227
  and a sum which is added at the end of the result file
197
228
 
198
229
  $ sycsvpro -f in.csv -o out.csv calc -r 2-20 -h *,target
199
- -c 6:*2,7:target=c6*10
230
+ -c 6:*2,7:c6*10
200
231
 
201
232
  $ cat out.csv
202
233
  customer;machine;control;drive;motor;date;contract;target
@@ -210,6 +241,20 @@ and a sum which is added at the end of the result file
210
241
  In the sum row non-numbers in the columns are converted to 0. Therefore column 0
211
242
  is summed up to 0 as all strings are converted to 0.
212
243
 
244
+ Write only columns 0, 6 and 7 by specifying write columns
245
+
246
+ $ sycsvpro -f in.csv -o out.csv calc -r 2-20 -h "customer,contract,target"
247
+ -c 6:*2,7:c6*10
248
+ -w 0,6-7
249
+ $ cat out.csv
250
+ customer;contract;target
251
+ hello;2;20
252
+ hello;2;20
253
+ indix;2;20
254
+ chiro;2;20
255
+ chiro;2;20
256
+ 0;10;100
257
+
213
258
  Join
214
259
  ----
215
260
  Join the machine and contract file with columns from the customer address file
@@ -250,6 +295,7 @@ Merge files machine_count.csv and revenue.csv based on the year columns.
250
295
  This will create the out.csv
251
296
 
252
297
  ```
298
+ $ cat out.csv
253
299
  ;2010;2013;2014
254
300
  hello;1;0;0
255
301
  indix;1;0;0
@@ -266,6 +312,7 @@ Sort rows on specified columns as an example sort rows based on customer
266
312
 
267
313
  $ sycsvpro -f in.csv -o out.csv sort -r 2-20 -c s:0,d:5
268
314
 
315
+ $ cat out.csv
269
316
  customer;machine;control;drive;motor;date;contract;target
270
317
  hello;h2;con123;dri130;mot110;1.02.3012;1
271
318
  hello;h1;con123;dri120;mot100;1.01.3013;1
@@ -406,8 +453,8 @@ row are added on top of the sorted file
406
453
  * `sycsvpro -f infile analyze` now lists the columns with sample data
407
454
  * Add `params` method to *Dsl* that retrieves the params provided in the execute
408
455
  command: `sycsvpro execute script.rb method infile param1 param2`
409
- * Add `clean_up` to *Dsl* that takes files to be deleted after the script has
410
- run: `clean_up(%w{file1 file2})`
456
+ * Add `clean\_up` to *Dsl* that takes files to be deleted after the script has
457
+ run: `clean\_up(%w{file1 file2})`
411
458
 
412
459
  Version 0.1.4
413
460
  -------------
@@ -465,7 +512,7 @@ Version 0.1.7
465
512
  This will join infile.csv with source.csv based on the join columns (j "1=3").
466
513
  From source.csv columns 2 and 4 (-c "2,4") will be inserted at column
467
514
  positions 1 and 3 (-p "1,3"). The header will be used from the infile.csv
468
- (-h "*") supplemented by the columns A and B (-i "A,B") that will also be
515
+ (-h "\*") supplemented by the columns A and B (-i "A,B") that will also be
469
516
  positioned at column 1 and 3 (-p "1,3").
470
517
 
471
518
  Version 0.1.8
@@ -474,8 +521,9 @@ Version 0.1.8
474
521
 
475
522
  Version 0.1.9
476
523
  -------------
477
- * When creating columns dynamically they are in arbitrary sequence. You can now
478
- provide a switch `sort: "2"` which will sort the header from column 2 on.
524
+ * When creating columns dynamically in count they are in arbitrary sequence.
525
+ You can now provide a switch `sort: "2"` which will sort the header from
526
+ column 2 on.
479
527
 
480
528
  Version 0.1.10
481
529
  --------------
@@ -488,6 +536,27 @@ Version 0.1.11
488
536
  * Unique removes duplicate lines from the infile. Duplicate lines are identified
489
537
  by key columns
490
538
 
539
+ Version 0.1.12
540
+ --------------
541
+ * Add unique to sycsvpro command line interface
542
+
543
+ Version 0.1.13
544
+ --------------
545
+ * Optimize Mapper by only considering columns provided for mapping which should
546
+ increase performance
547
+ * match\_boolean\_filter? in Filter now also processes strings with single
548
+ quotes inside
549
+ * Transposer transposes rows and columns, that is, makes columns rows and vice versa
550
+ * Calculator can now have colons inside the operation
551
+ sycsvpro -f in.csv -o out.csv calc -c "122:+[1,3,5].inject(:+)"
552
+ Previously the operation would have been cut after inject(
553
+ * A write flag in Calculator specifies which columns to add to the result.
554
+ * Calculator introduced a switch 'final\_header' which indicates the header
555
+ provided should not be filtered in regard to a provided 'write' flag but
556
+ written to the result file as is
557
+ * Merger now doesn't require a key column, that is, files can be merged without
558
+ key columns.
559
+
491
560
  Installation
492
561
  ============
493
562
  [![Gem Version](https://badge.fury.io/rb/sycsvpro.png)](http://badge.fury.io/rb/sycsvpro)
data/bin/sycsvpro CHANGED
@@ -589,6 +589,27 @@ command :map do |c|
589
589
  end
590
590
  end
591
591
 
592
+ desc 'Transposes rows and columns'
593
+ command :transpose do |c|
594
+ c.desc 'Rows to consider'
595
+ c.arg_name 'ROW1,ROW2,ROW10-ROW30,45-EOF,REGEXP'
596
+ c.flag [:r, :row], :must_match => row_regex
597
+
598
+ c.desc 'Columns to consider for transposing'
599
+ c.arg_name 'COL1,COL2,COL10-COL30'
600
+ c.flag [:c, :col], :must_match => /\d+(?:,\d+|-\d+)*/
601
+
602
+ c.action do |global_options,options,args|
603
+ print "Transpose..."
604
+ transpose = Sycsvpro::Transposer.new(infile: global_options[:f],
605
+ outfile: global_options[:o],
606
+ rows: options[:r],
607
+ cols: options[:c])
608
+ transpose.execute
609
+ puts "done"
610
+ end
611
+ end
612
+
592
613
  desc 'Process operations on columns. Optionally add a sum row for columns with'+
593
614
  'number values'
594
615
  command :calc do |c|
@@ -600,6 +621,11 @@ command :calc do |c|
600
621
  default_value '*'
601
622
  c.flag [:h, :header], :must_match => /^[*|\w ]+(?:,[\w ]+)*/
602
623
 
624
+ c.desc 'Indicates whether the provided header is final. That is if columns'+
625
+ ' to be written to the outfile are selected by the write flag then '+
626
+ 'the header should be left untouched and written as is'
627
+ c.switch [:f, :final], :default_value => false
628
+
603
629
  c.desc 'Rows to consider for calculations'
604
630
  c.arg_name 'ROW1,ROW2-ROW10,45-EOF,REGEXP'
605
631
  c.flag [:r, :row], :must_match => row_regex
@@ -610,6 +636,10 @@ command :calc do |c|
610
636
  c.arg_name "COL1:*2,COL2:-C3,COL3:*2+(4+C5)"
611
637
  c.flag [:c, :col], :must_match => /^\d+:.+/
612
638
 
639
+ c.desc 'Columns to be written to the result file'
640
+ c.arg_name "COL1,COL2-COL5"
641
+ c.flag [:w, :write], :must_match => /\d+(?:,\d+|-\d+)*/
642
+
613
643
  c.desc 'Date format of date columns'
614
644
  c.arg_name '%d.%m.%Y|%Y-%m-%d|...'
615
645
  c.flag [:df]
@@ -622,13 +652,15 @@ command :calc do |c|
622
652
  help_now! "You need to provide the column flag" if options[:c].nil?
623
653
 
624
654
  print "Calculating..."
625
- calculator = Sycsvpro::Calculator.new(infile: global_options[:f],
626
- outfile: global_options[:o],
627
- header: options[:h],
628
- rows: options[:r],
629
- cols: options[:c],
630
- sum: options[:s],
631
- df: options[:df])
655
+ calculator = Sycsvpro::Calculator.new(infile: global_options[:f],
656
+ outfile: global_options[:o],
657
+ header: options[:h],
658
+ final_header: options[:f],
659
+ rows: options[:r],
660
+ cols: options[:c],
661
+ write: options[:w],
662
+ sum: options[:s],
663
+ df: options[:df])
632
664
  calculator.execute
633
665
  puts "done"
634
666
  end
@@ -58,8 +58,13 @@ module Sycsvpro
58
58
  attr_reader :formulae
59
59
  # header of the outfile
60
60
  attr_reader :header
61
+ # indicates whether this header is final and should not be filtered in
62
+ # respect to the columns defined by write
63
+ attr_reader :final_header
61
64
  # filter that is used for columns
62
65
  attr_reader :columns
66
+ # selected columns to be written to outfile
67
+ attr_reader :write
63
68
  # if true add a sum row at the bottom of the out file
64
69
  attr_reader :add_sum_row
65
70
 
@@ -67,29 +72,36 @@ module Sycsvpro
67
72
  # can be supplemented with additional column names that are generated due
68
73
  # to an arithmetic operation that creates new columns
69
74
  # :call-seq:
70
- # Sycsvpro::Calculator.new(infile: "in.csv",
71
- # outfile: "out.csv",
72
- # df: "%d.%m.%Y",
73
- # rows: "1,2,BEGINn3>20END",
74
- # header: "*,Count",
75
- # cols: "4:c1+c2*2",
76
- # sum: true).execute
75
+ # Sycsvpro::Calculator.new(infile: "in.csv",
76
+ # outfile: "out.csv",
77
+ # df: "%d.%m.%Y",
78
+ # rows: "1,2,BEGINn3>20END",
79
+ # header: "*,Count",
80
+ # final_header: false,
81
+ # cols: "4:c1+c2*2",
82
+ # write: "1,3-5",
83
+ # sum: true).execute
77
84
  # infile:: File that contains the rows to be operated on
78
85
  # outfile:: Result of the operations
79
86
  # df:: Date format
80
87
  # rows:: Row filter that indicates which rows to consider
81
88
  # header:: Header of the columns
89
+ # final_header:: Indicates that if write filters columns the header should
90
+ # not be filtered when written
82
91
  # cols:: Operations on the column values
92
+ # write:: Columns that are written to the outfile
83
93
  # sum:: Indicate whether to add a sum row
84
94
  def initialize(options={})
85
- @infile = options[:infile]
86
- @outfile = options[:outfile]
87
- @date_format = options[:df] || "%Y-%m-%d"
88
- @row_filter = RowFilter.new(options[:rows], df: options[:df])
89
- @header = Header.new(options[:header])
90
- @sum_row = []
91
- @add_sum_row = options[:sum] || false
92
- @formulae = {}
95
+ @infile = options[:infile]
96
+ @outfile = options[:outfile]
97
+ @date_format = options[:df] || "%Y-%m-%d"
98
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
99
+ @write_filter = ColumnFilter.new(options[:write], df: options[:df])
100
+ @header = Header.new(options[:header])
101
+ @final_header = options[:final_header]
102
+ @sum_row = []
103
+ @add_sum_row = options[:sum]
104
+ @formulae = {}
93
105
  create_calculator(options[:cols])
94
106
  end
95
107
 
@@ -112,7 +124,8 @@ module Sycsvpro
112
124
 
113
125
  unless processed_header
114
126
  header_row = header.process(line.chomp)
115
- out.puts header_row unless header_row.empty?
127
+ header_row = @write_filter.process(header_row) unless @final_header
128
+ out.puts header_row unless header_row.nil? or header_row.empty?
116
129
  processed_header = true
117
130
  next
118
131
  end
@@ -123,7 +136,7 @@ module Sycsvpro
123
136
  formulae.each do |col, formula|
124
137
  @columns[col.to_i] = eval(formula)
125
138
  end
126
- out.puts @columns.join(';')
139
+ out.puts @write_filter.process(@columns.join(';'))
127
140
 
128
141
  @columns.each_with_index do |column, index|
129
142
  column = 0 unless column.to_s =~ /^[\d\.,]*$/
@@ -137,7 +150,7 @@ module Sycsvpro
137
150
 
138
151
  end
139
152
 
140
- out.puts @sum_row.join(';') if add_sum_row
153
+ out.puts @write_filter.process(@sum_row.join(';')) if add_sum_row
141
154
 
142
155
  end
143
156
  end
@@ -154,7 +167,7 @@ module Sycsvpro
154
167
  # column 1 + 1 c[4] = c[1] + 1
155
168
  def create_calculator(code)
156
169
  code.split(/,(?=\d+:)/).each do |operation|
157
- col, term = operation.split(':')
170
+ col, term = operation.split(':', 2)
158
171
  term = "c#{col}#{term}" if term =~ /^[+\-*\/%]/
159
172
  formulae[col] = term
160
173
  end
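The effect of the `split(':', 2)` change in `create_calculator` can be checked in isolation. The following is a minimal sketch, not the gem's code: `parse_formulae` is a hypothetical helper that copies the splitting logic from the hunk above and exposes the split limit as a parameter so the old and new behaviour can be compared.

```ruby
# Hypothetical helper (not part of sycsvpro) that mirrors the splitting
# logic of Calculator#create_calculator.
# limit = 2 corresponds to the new code; limit = 0 behaves like
# split(':') without a limit, which is what the old code did.
def parse_formulae(code, limit = 2)
  formulae = {}
  # Operations are separated by commas that are followed by "<digits>:"
  code.split(/,(?=\d+:)/).each do |operation|
    col, term = operation.split(':', limit)
    # A term that starts with an operator is applied to the column itself
    term = "c#{col}#{term}" if term =~ /^[+\-*\/%]/
    formulae[col] = term
  end
  formulae
end

code = "3:+[c1,c2].inject(:+),4:c2*3"

p parse_formulae(code)
# => {"3"=>"c3+[c1,c2].inject(:+)", "4"=>"c2*3"}

p parse_formulae(code, 0)
# => {"3"=>"c3+[c1,c2].inject(", "4"=>"c2*3"}
```

With the limit of 2 only the first colon separates the column from the term, so the `:+` symbol inside `inject` survives. Without the limit the term is cut at the second colon.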
data/lib/sycsvpro/dsl.rb CHANGED
@@ -76,8 +76,9 @@ module Dsl
76
76
  end
77
77
  end
78
78
 
79
- # Remove leading and trailing " and spaces as well as reducing more than 2 spaces between words
80
- # from csv values. Replac ; with , from values as ; is used as value separator
79
+ # Remove leading and trailing " and spaces as well as reducing more than 2
80
+ # spaces between words from csv values. Replace ; with , from values as ;
81
+ # is used as value separator
81
82
  def unstring(line)
82
83
  line = str2utf8(line)
83
84
  line.scan(/(?<=^"|;")[^"]+(?=;)+[^"]*|;+[^"](?=";|"$)/).each do |value|
@@ -71,7 +71,7 @@ module Sycsvpro
71
71
  when 'n'
72
72
  values[c[2].to_i].empty? ? '0' : values[c[2].to_i]
73
73
  when 's'
74
- "'#{values[c[2].to_i]}'"
74
+ "\"#{values[c[2].to_i]}\""
75
75
  when 'd'
76
76
  begin
77
77
  Date.strptime(values[c[2].to_i], date_format)
@@ -2,8 +2,33 @@
2
2
  module Sycsvpro
3
3
 
4
4
  # Map values to new values described in a mapping file
5
+ #
6
+ # in.csv
7
+ #
8
+ # | ID | Name |
9
+ # | --- | ---- |
10
+ # | 1 | Hank |
11
+ # | 2 | Jane |
12
+ #
13
+ # mapping
14
+ #
15
+ # 1:01
16
+ # 2:02
17
+ #
18
+ # Sycsvpro::Mapper.new(infile: "in.csv",
19
+ # outfile: "out.csv",
20
+ # mapping: "mapping",
21
+ # cols: "0").execute
22
+ # out.csv
23
+ #
24
+ # | ID | Name |
25
+ # | --- | ---- |
26
+ # | 01 | Hank |
27
+ # | 02 | Jane |
5
28
  class Mapper
6
29
 
30
+ include Dsl
31
+
7
32
  # infile contains the data that is operated on
8
33
  attr_reader :infile
9
34
  # outfile is the file where the result is written to
@@ -12,15 +37,29 @@ module Sycsvpro
12
37
  attr_reader :mapper
13
38
  # filter that is used for rows
14
39
  attr_reader :row_filter
15
- # filter that is used for columns
40
+ # filter that contains columns that are considered for mappings
16
41
  attr_reader :col_filter
17
42
 
18
43
  # Creates new mapper
44
+ # :call-seq:
45
+ # Sycsvpro::Mapper.new(infile: "in.csv",
46
+ # outfile: "out.csv",
47
+ # mapping: "mapping.csv",
48
+ # rows: "1,3-5",
49
+ # cols: "3,4-7",
50
+ # df: "%Y-%m-%d").execute
51
+ #
52
+ # infile:: File that contains columns to be mapped
53
+ # outfile:: File that contains the mapping result after execute
54
+ # mapping:: File that contains the mappings. Mappings are separated by ':'
55
+ # rows:: Rows to consider for mappings
56
+ # cols:: Columns that should be mapped
57
+ # df:: Date format for row filter if rows are filtered on date values
19
58
  def initialize(options={})
20
59
  @infile = options[:infile]
21
60
  @outfile = options[:outfile]
22
- @row_filter = RowFilter.new(options[:row_filter], df: options[:df])
23
- @col_filter = ColumnFilter.new(options[:col_filter], df: options[:df])
61
+ @row_filter = RowFilter.new(options[:rows], df: options[:df])
62
+ @col_filter = init_col_filter(options[:cols], @infile)
24
63
  @mapper = {}
25
64
  init_mapper(options[:mapping])
26
65
  end
@@ -29,25 +68,49 @@ module Sycsvpro
29
68
  def execute
30
69
  File.open(outfile, 'w') do |out|
31
70
  File.new(infile, 'r').each_with_index do |line, index|
32
- result = col_filter.process(row_filter.process(line, row: index))
71
+ result = row_filter.process(line, row: index)
33
72
  next if result.chomp.empty? or result.nil?
34
- mapper.each do |from, to|
35
- result = result.chomp.gsub(/(?<=^|;)#{from}(?=;|$)/, to)
73
+ result += ' ' if result =~ /;$/
74
+ cols = result.split(';')
75
+ @col_filter.each do |key|
76
+ substitute = mapper[cols[key]]
77
+ cols[key] = substitute if substitute
36
78
  end
37
- out.puts result
79
+ out.puts cols.join(';').strip
38
80
  end
39
81
  end
40
82
  end
41
83
 
42
84
  private
43
85
 
44
- # Initializes the mappings
86
+ # Initializes the mappings. A mapping consists of the value to be mapped
87
+ # to another value. The values are separated by colons ':'
88
+ # Example:
89
+ # source_value:mapping_value
45
90
  def init_mapper(file)
46
91
  File.new(file, 'r').each_line do |line|
47
- from, to = line.chomp.split(':')
92
+ from, to = unstring(line).split(':')
48
93
  mapper[from] = to
49
94
  end
50
95
  end
96
+
97
+ # Initialize the col_filter that contains columns to be considered for
98
+ # mapping. If no columns are provided, that is columns is nil, a filter with
99
+ # all columns is returned
100
+ def init_col_filter(columns, source)
101
+ if columns.nil?
102
+ File.open(source, 'r').each do |line|
103
+ line = unstring(line)
104
+ next if line.empty?
105
+ line += ' ' if line =~ /;$/
106
+ size = line.split(';').size
107
+ columns = "0-#{size-1}"
108
+ break
109
+ end
110
+ end
111
+ ColumnFilter.new(columns).filter.flatten
112
+ end
113
+
51
114
  end
52
115
 
53
116
  end
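The per-column substitution that replaces the former `gsub` loop in `Mapper#execute` can be sketched on a single line of input. This is an illustration under assumptions: `map_columns` is a hypothetical helper, the mapping is passed as a plain Hash instead of being read from a mapping file, and the column indices are given as an Array rather than through `ColumnFilter`.

```ruby
# Hypothetical helper (not part of sycsvpro) that applies a mapping to
# selected columns of one semicolon-separated line.
def map_columns(line, mapping, columns)
  line = line.chomp
  # String#split drops trailing empty fields, so pad a line that ends
  # with ';' to keep its last (empty) column
  line += ' ' if line =~ /;$/
  cols = line.split(';')
  columns.each do |index|
    substitute = mapping[cols[index]]
    # Values without a mapping entry are left unchanged
    cols[index] = substitute if substitute
  end
  # strip removes the padding blank added above
  cols.join(';').strip
end

mapping = { "con123" => "control123", "dri222" => "drive222" }

puts map_columns("Fink;1234;con123;dri222", mapping, [2])
# prints "Fink;1234;control123;dri222"

puts map_columns("zink;8839;con123;", mapping, [2, 3])
# prints "zink;8839;control123;"
```

Each selected column costs one Hash lookup, whereas the previous approach ran one regular expression substitution per mapping entry over the whole line.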
@@ -69,21 +69,25 @@ module Sycsvpro
69
69
  # source_header:: pattern for each header of the source file to determine
70
70
  # the column. The pattern is a regex without the enclosing slashes '/'
71
71
  # key:: first column value from the source file that is used as first
72
- # column in the target file
72
+ # column in the target file. The key is optional.
73
73
  def initialize(options = {})
74
74
  @outfile = options[:outfile]
75
75
  @header_cols = options[:header].split(',')
76
76
  @source_header = options[:source_header].split(',')
77
- @key = options[:key].split(',')
77
+ @key = options[:key] ? options[:key].split(',') : []
78
+ @has_key = !@key.empty?
78
79
  @files = options[:files].split(',')
80
+ if @source_header.count != @files.count
81
+ raise "file count has to be equal to source_header count"
82
+ end
79
83
  end
80
84
 
81
85
  # Merges the files based on the provided parameters
82
86
  def execute
83
87
  File.open(outfile, 'w') do |out|
84
- out.puts ";#{header_cols.join(';')}"
88
+ out.puts "#{';' unless @key.empty?}#{header_cols.join(';')}"
85
89
  files.each do |file|
86
- @current_key = @key.shift
90
+ @current_key = create_current_key
87
91
  @current_source_header = @source_header.shift
88
92
  processed_header = false
89
93
  File.open(file).each_with_index do |line, index|
@@ -110,16 +114,25 @@ module Sycsvpro
110
114
  columns[i] = c.scan(Regexp.new(@current_source_header)).flatten[0]
111
115
  end
112
116
 
113
- @file_header = [@current_key.to_i]
117
+ @file_header = @current_key ? [@current_key.to_i] : []
118
+
114
119
  header_cols.each do |h|
115
120
  @file_header << columns.index(h)
116
121
  end
122
+
117
123
  @file_header.compact!
118
124
  end
119
125
 
126
+ # Creates the current key. Returns the next key or nil if it is missing or blank
127
+ def create_current_key
128
+ key = @key.shift
129
+ key.nil? || key.strip.empty? ? nil : key
130
+ end
131
+
120
132
  # create a line filtered by the file_header
121
133
  def create_line(columns)
122
- columns.values_at(*@file_header).join(';')
134
+ empty_col = ';' if @has_key && @current_key.nil?
135
+ "#{empty_col}#{columns.values_at(*@file_header).join(';')}"
123
136
  end
124
137
 
125
138
  end
@@ -0,0 +1,77 @@
1
+ # Operating csv files
2
+ module Sycsvpro
3
+
4
+ # Transposes rows to columns and vice versa
5
+ #
6
+ # Example
7
+ #
8
+ # infile.csv
9
+ # | Year | SP | RP | Total | SP-O | RP-O | O |
10
+ # | ---- | -- | -- | ----- | ---- | ---- | --- |
11
+ # | | 10 | 20 | 30 | 100 | 40 | 140 |
12
+ # | 2008 | 5 | 10 | 15 | 10 | 20 | 10 |
13
+ # | 2009 | 2 | 5 | 5 | 20 | 10 | 30 |
14
+ # | 2010 | 3 | 5 | 10 | 70 | 10 | 100 |
15
+ #
16
+ # outfile.csv
17
+ # | Year | | 2008 | 2009 | 2010 |
18
+ # | ----- | --- | ---- | ---- | ---- |
19
+ # | SP | 10 | 5 | 2 | 3 |
20
+ # | RP | 20 | 10 | 5 | 5 |
21
+ # | Total | 30 | 15 | 5 | 10 |
22
+ # | SP-O | 100 | 10 | 20 | 70 |
23
+ # | RP-O | 40 | 20 | 10 | 10 |
24
+ # | O | 140 | 10 | 30 | 100 |
25
+ #
26
+ class Transposer
27
+
28
+ include Dsl
29
+
30
+ # infile contains the data that is operated on
31
+ attr_reader :infile
32
+ # outfile is the file where the result is written to
33
+ attr_reader :outfile
34
+ # filter that is used for rows
35
+ attr_reader :row_filter
36
+ # filter that is used for columns
37
+ attr_reader :col_filter
38
+
39
+ # Create a new Transposer
40
+ # :call-seq:
41
+ # Sycsvpro::Transposer.new(infile: "infile.csv",
42
+ # outfile: "outfile.csv",
43
+ # rows: "0,3-5",
44
+ # cols: "1,3").execute
45
+ def initialize(options = {})
46
+ @infile = options[:infile]
47
+ @outfile = options[:outfile]
48
+ @row_filter = RowFilter.new(options[:rows])
49
+ @col_filter = ColumnFilter.new(options[:cols])
50
+ end
51
+
52
+ # Executes the transpose by reading the infile and writing the result to
53
+ # the outfile
54
+ def execute
55
+ transpose = {}
56
+
57
+ File.open(@infile).each_with_index do |line, index|
58
+ line = unstring(line)
59
+ next if line.empty?
60
+
61
+ result = @col_filter.process(@row_filter.process(line, row: index))
62
+ next if result.nil?
63
+
64
+ result.split(';').each_with_index do |col, index|
65
+ transpose[index] ||= []
66
+ transpose[index] << col
67
+ end
68
+ end
69
+
70
+ File.open(@outfile, 'w') do |out|
71
+ transpose.values.each { |value| out.puts value.join(';') }
72
+ end
73
+ end
74
+
75
+ end
76
+
77
+ end
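The accumulation step of `Transposer#execute` can be tried without files. The following is a minimal sketch, assuming plain semicolon-separated lines and leaving out `unstring` as well as the row and column filters. `transpose_lines` is a hypothetical name and not part of the gem.

```ruby
# Hypothetical helper (not part of sycsvpro): collects the n-th value of
# every line into the n-th output line, as Transposer#execute does after
# filtering.
def transpose_lines(lines)
  transposed = {}
  lines.each do |line|
    line.chomp.split(';').each_with_index do |value, index|
      (transposed[index] ||= []) << value
    end
  end
  # Hashes keep insertion order, so values come out in column order
  transposed.values.map { |values| values.join(';') }
end

rows = ["Year;SP;RP",
        "2008;5;10",
        "2009;2;5"]

puts transpose_lines(rows)
# prints
# Year;2008;2009
# SP;5;2
# RP;10;5
```

Note that `String#split` drops trailing empty fields, so in this sketch a line ending in `;` contributes one value fewer than the other lines.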
@@ -1,5 +1,5 @@
1
1
  # Operating csv files
2
2
  module Sycsvpro
3
3
  # Version number of sycsvpro
4
- VERSION = '0.1.12'
4
+ VERSION = '0.1.13'
5
5
  end
data/lib/sycsvpro.rb CHANGED
@@ -17,3 +17,4 @@ require 'sycsvpro/table.rb'
17
17
  require 'sycsvpro/join.rb'
18
18
  require 'sycsvpro/merger.rb'
19
19
  require 'sycsvpro/unique.rb'
20
+ require 'sycsvpro/transposer.rb'
@@ -12,6 +12,96 @@ module Sycsvpro
12
12
  @out_file = File.join(File.dirname(__FILE__), "files/machines_out.csv")
13
13
  end
14
14
 
15
+ it "should ignore colons within calculation expression" do
16
+ cols = "3:+[c1,c2].inject(:+),4:c2*3"
17
+ header = "*,times"
18
+
19
+ calculator = Calculator.new(infile: @in_number_file,
20
+ outfile: @out_file,
21
+ header: header,
22
+ cols: cols)
23
+
24
+ calculator.execute
25
+
26
+ result = [ "customer;before;between;after;times",
27
+ "Fink;2;3;6;9",
28
+ "Haas;3;1;10;3",
29
+ "Gent;4;4;12;12",
30
+ "Rank;5;4;10;12" ]
31
+
32
+ rows = 0
33
+
34
+ File.open(@out_file).each_with_index do |line, index|
35
+ line.chomp.should eq result[index]
36
+ rows += 1
37
+ end
38
+
39
+ rows.should eq result.size
40
+ end
41
+
42
+ it "should save only specified columns" do
43
+ cols = "3:+[c1,c2].inject(:+),4:c3*3"
44
+ write = "0,3-4"
45
+ header = "customer;sum;times"
46
+
47
+ calculator = Calculator.new(infile: @in_number_file,
48
+ outfile: @out_file,
49
+ header: header,
50
+ final_header: true,
51
+ write: write,
52
+ cols: cols,
53
+ sum: true)
54
+
55
+ calculator.execute
56
+
57
+ result = [ "customer;sum;times",
58
+ "Fink;6;18",
59
+ "Haas;10;30",
60
+ "Gent;12;36",
61
+ "Rank;10;30",
62
+ "0;38;114" ]
63
+
64
+ rows = 0
65
+
66
+ File.open(@out_file).each_with_index do |line, index|
67
+ line.chomp.should eq result[index]
68
+ rows += 1
69
+ end
70
+
71
+ rows.should eq result.size
72
+ end
73
+
74
+ it "should save only specified columns" do
75
+ cols = "3:+[c1,c2].inject(:+),4:c3*3"
76
+ write = "0,3-4"
77
+ header = "*,times"
78
+
79
+ calculator = Calculator.new(infile: @in_number_file,
80
+ outfile: @out_file,
81
+ header: header,
82
+ write: write,
83
+ cols: cols,
84
+ sum: true)
85
+
86
+ calculator.execute
87
+
88
+ result = [ "customer;after;times",
89
+ "Fink;6;18",
90
+ "Haas;10;30",
91
+ "Gent;12;36",
92
+ "Rank;10;30",
93
+ "0;38;114" ]
94
+
95
+ rows = 0
96
+
97
+ File.open(@out_file).each_with_index do |line, index|
98
+ line.chomp.should eq result[index]
99
+ rows += 1
100
+ end
101
+
102
+ rows.should eq result.size
103
+ end
104
+
15
105
  it "should operate on existing row" do
16
106
  rows = "2-8"
17
107
  cols = "3:*3,4:*4+1"
@@ -6,12 +6,16 @@ module Sycsvpro
6
6
 
7
7
  before do
8
8
  @in_file = File.join(File.dirname(__FILE__), "files/in.csv")
9
+ @in_file5 = File.join(File.dirname(__FILE__), "files/in5.csv")
9
10
  @out_file = File.join(File.dirname(__FILE__), "files/out.csv")
10
11
  @mappings = File.join(File.dirname(__FILE__), "files/mappings")
11
12
  end
12
13
 
13
- it "should map values to new values" do
14
- mapper = Mapper.new(infile: @in_file, outfile: @out_file, mapping: @mappings)
14
+ it "should map values to new values in all columns" do
15
+ mapper = Mapper.new(infile: @in_file,
16
+ outfile: @out_file,
17
+ rows: "0-7",
18
+ mapping: @mappings)
15
19
 
16
20
  mapper.execute
17
21
 
@@ -30,6 +34,60 @@ module Sycsvpro
30
34
 
31
35
  end
32
36
 
37
+ it "should map values to new values on specified columns only" do
38
+ mapper = Mapper.new(infile: @in_file,
39
+ outfile: @out_file,
40
+ rows: "0-7",
41
+ cols: "4",
42
+ mapping: @mappings).execute
43
+
44
+ result = [ "customer;contract-number;expires-on;machine;product1;product2",
45
+ "Fink;1234;20.12.2015;f1;control123;dri222",
46
+ "Haas;3322;1.10.2011;h1;control332;dri111",
47
+ "Gent;4323;1.3.2014;g1;control123;dri111",
48
+ "Fink;1234;30.12.2016;f2;control333;dri321",
49
+ "Rank;3232;1.5.2013;r1;control332;dri321",
50
+ "Klig;4432;;k1;control332;dri222",
51
+ "fink;1234;;f3;control332;dri321" ]
52
+
53
+ rows = 0
54
+
55
+ File.open(@out_file).each_with_index do |line, index|
56
+ line.chomp.should eq result[index]
57
+ rows += 1
58
+ end
59
+
60
+ rows.should eq result.size
61
+
62
+ end
63
+
64
+ it "should map values to new values where last column is empty" do
65
+ mapper = Mapper.new(infile: @in_file5,
66
+ outfile: @out_file,
67
+ cols: "5",
68
+ mapping: @mappings).execute
69
+
70
+ result = [ "customer;contract-number;expires-on;machine;product1;product2",
71
+ "Fink;1234;20.12.2015;f1;con123;drive222",
72
+ "Haas;3322;1.10.2011;h1;con332;drive111",
73
+ "Gent;4323;1.3.2014;g1;con123;drive111",
74
+ "Fink;1234;30.12.2016;f2;con333;drive321",
75
+ "Rank;3232;1.5.2013;r1;con332;drive321",
76
+ "Klig;4432;;k1;con332;drive222",
77
+ "fink;1234;;f3;con332;drive321",
78
+ "zink;8839;8.8.2018;z3;con332;" ]
79
+
80
+ rows = 0
81
+
82
+ File.open(@out_file).each_with_index do |line, index|
83
+ line.chomp.should eq result[index]
84
+ rows += 1
85
+ end
86
+
87
+ rows.should eq result.size
88
+
89
+ end
90
+
33
91
  end
34
92
 
35
93
  end
@@ -7,6 +7,8 @@ module Sycsvpro
7
7
  before do
8
8
  @file1 = File.join(File.dirname(__FILE__), "files/merge1.csv")
9
9
  @file2 = File.join(File.dirname(__FILE__), "files/merge2.csv")
10
+ @file3 = File.join(File.dirname(__FILE__), "files/merge3.csv")
11
+ @file4 = File.join(File.dirname(__FILE__), "files/merge4.csv")
10
12
  @outfile = File.join(File.dirname(__FILE__), "files/merged.csv")
11
13
  end
12
14
 
@@ -100,6 +102,97 @@ module Sycsvpro
100
102
  rows.should eq result.size
101
103
  end
102
104
 
105
+ it "should merge two files without key columns" do
106
+ header = "2010,2011,2012,2014"
107
+ source_header = "(\\d{4}),(\\d{4})"
108
+
109
+ Sycsvpro::Merger.new(outfile: @outfile,
110
+ files: "#{@file4},#{@file3}",
111
+ header: header,
112
+ source_header: source_header).execute
113
+
114
+ result = [ "2010;2011;2012;2014",
115
+ "20;30;40;60",
116
+ "30;40;50;70",
117
+ "40;50;60;80",
118
+ "50;60;70;90",
119
+ "m1;m2;m3",
120
+ "n1;n2;n3",
121
+ "o1;;o3", ]
122
+
123
+ rows = 0
124
+
125
+ File.open(@outfile).each_with_index do |row, index|
126
+ row.chomp.should eq result[index]
127
+ rows += 1
128
+ end
129
+
130
+ rows.should eq result.size
131
+ end
132
+
133
+ it "should merge two files key columns in one file only" do
134
+ header = "2010,2011,2012,2014"
135
+ key = "0"
136
+ source_header = "(\\d{4}),(\\d{4})"
137
+
138
+ Sycsvpro::Merger.new(outfile: @outfile,
139
+ files: "#{@file1},#{@file3}",
140
+ header: header,
141
+ key: key,
142
+ source_header: source_header).execute
143
+
144
+ result = [ ";2010;2011;2012;2014",
145
+ "SP;20;30;40;60",
146
+ "RP;30;40;50;70",
147
+ "MP;40;50;60;80",
148
+ "NP;50;60;70;90",
149
+ ";m1;m2;m3",
150
+ ";n1;n2;n3",
151
+ ";o1;;o3", ]
152
+
153
+ rows = 0
154
+
155
+ File.open(@outfile).each_with_index do |row, index|
156
+ row.chomp.should eq result[index]
157
+ rows += 1
158
+ end
159
+
160
+ rows.should eq result.size
161
+ end
162
+
163
+ it "should merge two files key columns in two files of three only" do
164
+ header = "2010,2011,2012,2014"
165
+ key = "0, ,0"
166
+ source_header = "(\\d{4}),(\\d{4}),(\\d{4})"
167
+
168
+ Sycsvpro::Merger.new(outfile: @outfile,
169
+ files: "#{@file1},#{@file3},#{@file2}",
170
+ header: header,
171
+ key: key,
172
+ source_header: source_header).execute
173
+
174
+ result = [ ";2010;2011;2012;2014",
175
+ "SP;20;30;40;60",
176
+ "RP;30;40;50;70",
177
+ "MP;40;50;60;80",
178
+ "NP;50;60;70;90",
179
+ ";m1;m2;m3",
180
+ ";n1;n2;n3",
181
+ ";o1;;o3",
182
+ "M;m1;m2;m3",
183
+ "N;n1;n2;n3",
184
+ "O;o1;;o3" ]
185
+
186
+ rows = 0
187
+
188
+ File.open(@outfile).each_with_index do |row, index|
189
+ row.chomp.should eq result[index]
190
+ rows += 1
191
+ end
192
+
193
+ rows.should eq result.size
194
+ end
195
+
103
196
  end
104
197
 
105
198
  end
@@ -0,0 +1,76 @@
1
+ require 'sycsvpro/transposer'
2
+
3
+ module Sycsvpro
4
+
5
+ describe Transposer do
6
+
7
+ before do
8
+ @infile = File.join(File.dirname(__FILE__), 'files/in6.csv')
9
+ @outfile = File.join(File.dirname(__FILE__), 'files/out.csv')
10
+ end
11
+
12
+ it "should transpose (change rows to columns) complete file" do
13
+ Sycsvpro::Transposer.new(infile: @infile,
14
+ outfile: @outfile).execute
15
+
16
+ result = [ "Year;;2008;2009;2010",
17
+ "SP;10;5;2;3",
18
+ "RP;20;10;5;5",
19
+ "Total;30;15;5;10",
20
+ "SP-O;100;10;20;70",
21
+ "RP-O;40;20;10;10",
22
+ "O;140;10;30;100" ]
23
+
24
+ rows = 0
25
+
26
+ File.open(@outfile).each_with_index do |line, i|
27
+ line.chomp.should eq result[i]
28
+ rows += 1
29
+ end
30
+
31
+ rows.should eq result.size
32
+ end
33
+
34
+ it "should transpose selected columns" do
35
+ Sycsvpro::Transposer.new(infile: @infile,
36
+ outfile: @outfile,
37
+ cols: "0-2").execute
38
+
39
+ result = [ "Year;;2008;2009;2010",
40
+ "SP;10;5;2;3",
41
+ "RP;20;10;5;5" ]
42
+
43
+ rows = 0
44
+
45
+ File.open(@outfile).each_with_index do |line, i|
46
+ line.chomp.should eq result[i]
47
+ rows += 1
48
+ end
49
+
50
+ rows.should eq result.size
51
+ end
52
+
53
+ it "should transpose selected rows and columns" do
54
+ Sycsvpro::Transposer.new(infile: @infile,
55
+ outfile: @outfile,
56
+ rows: "0,2-4",
57
+ cols: "0-2").execute
58
+
59
+ result = [ "Year;2008;2009;2010",
60
+ "SP;5;2;3",
61
+ "RP;10;5;5" ]
62
+
63
+ rows = 0
64
+
65
+ File.open(@outfile).each_with_index do |line, i|
66
+ line.chomp.should eq result[i]
67
+ rows += 1
68
+ end
69
+
70
+ rows.should eq result.size
71
+ end
72
+
73
+
74
+ end
75
+
76
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: sycsvpro
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.12
4
+ version: 0.1.13
5
5
  prerelease:
6
6
  platform: ruby
7
7
  authors:
@@ -9,7 +9,7 @@ authors:
9
9
  autorequire:
10
10
  bindir: bin
11
11
  cert_chain: []
12
- date: 2014-07-09 00:00:00.000000000 Z
12
+ date: 2014-07-14 00:00:00.000000000 Z
13
13
  dependencies:
14
14
  - !ruby/object:Gem::Dependency
15
15
  name: rake
@@ -151,6 +151,7 @@ files:
151
151
  - lib/sycsvpro/script_list.rb
152
152
  - lib/sycsvpro/sorter.rb
153
153
  - lib/sycsvpro/table.rb
154
+ - lib/sycsvpro/transposer.rb
154
155
  - lib/sycsvpro/unique.rb
155
156
  - lib/sycsvpro/version.rb
156
157
  - spec/sycsvpro/aggregator_spec.rb
@@ -175,6 +176,7 @@ files:
175
176
  - spec/sycsvpro/script_list_spec.rb
176
177
  - spec/sycsvpro/sorter_spec.rb
177
178
  - spec/sycsvpro/table_spec.rb
179
+ - spec/sycsvpro/transposer_spec.rb
178
180
  - spec/sycsvpro/unique_spec.rb
179
181
  - sycsvpro.gemspec
180
182
  - sycsvpro.rdoc