ar_loader 0.0.4
- data/LICENSE +9 -0
- data/README.markdown +211 -0
- data/Rakefile +76 -0
- data/lib/VERSION +1 -0
- data/lib/ar_loader.rb +53 -0
- data/lib/engine/file_definitions.rb +353 -0
- data/lib/engine/jruby/jexcel_file.rb +182 -0
- data/lib/engine/jruby/method_mapper_excel.rb +44 -0
- data/lib/engine/mapping_file_definitions.rb +88 -0
- data/lib/engine/method_detail.rb +139 -0
- data/lib/engine/method_mapper.rb +157 -0
- data/lib/engine/method_mapper_csv.rb +28 -0
- data/lib/engine/word.rb +70 -0
- data/lib/java/poi-3.2-FINAL-20081019.jar +0 -0
- data/lib/java/poi-3.6.jar +0 -0
- data/lib/java/poi-contrib-3.6-20091214.jar +0 -0
- data/lib/java/poi-examples-3.6-20091214.jar +0 -0
- data/lib/java/poi-ooxml-3.6-20091214.jar +0 -0
- data/lib/java/poi-ooxml-schemas-3.6-20091214.jar +0 -0
- data/lib/java/poi-scratchpad-3.6-20091214.jar +0 -0
- data/lib/loaders/loader_base.rb +61 -0
- data/lib/loaders/spree/image_loader.rb +47 -0
- data/lib/loaders/spree/product_loader.rb +93 -0
- data/lib/to_b.rb +24 -0
- data/spec/excel_loader_spec.rb +138 -0
- data/spec/spec_helper.rb +37 -0
- data/tasks/db_tasks.rake +65 -0
- data/tasks/excel_loader.rake +101 -0
- data/tasks/file_tasks.rake +38 -0
- data/tasks/seed_fu_product_template.erb +15 -0
- data/tasks/spree/image_load.rake +103 -0
- data/tasks/spree/product_loader.rake +107 -0
- data/tasks/tidy_config.txt +13 -0
- data/tasks/word_to_seedfu.rake +167 -0
- metadata +90 -0
data/LICENSE
ADDED
data/README.markdown
ADDED
@@ -0,0 +1,211 @@
|
|
1
|
+
# AR Loader
|
2
|
+
|
3
|
+
General Active Record Loader with current focus on support for Spree.
|
4
|
+
|
5
|
+
Maps column headings to attributes and associations.
|
6
|
+
|
7
|
+
Fully extendable via spreadsheet headings - simply add new column to Excel with
|
8
|
+
attribute or association name, and loader will attempt to
|
9
|
+
find correct association and populate AR object with row data.
|
10
|
+
|
11
|
+
Can handle human-readable forms of column names. For example, given an association on an AR model called
|
12
|
+
product_properties, will map from column headings such as 'product_properties',
|
13
|
+
'Product Properties', 'product properties' etc
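
The heading-to-association mapping described above can be sketched roughly as follows. This is only an illustration, not the gem's actual implementation: `normalize_heading` and `find_association` are hypothetical names, and the candidate list stands in for the associations that would normally be discovered on the AR class.

```ruby
# Hypothetical helper: reduce a spreadsheet heading to a snake_case name.
def normalize_heading(heading)
  heading.to_s.strip.downcase.gsub(/[\s\-]+/, '_')
end

# Hypothetical lookup: return the candidate that matches the heading, or nil.
def find_association(heading, candidates)
  wanted = normalize_heading(heading)
  candidates.map(&:to_s).find { |name| normalize_heading(name) == wanted }
end

# Assumed association names, for illustration only.
candidates = [:product_properties, :option_types, :taxons]

['product_properties', 'Product Properties', 'product properties'].each do |heading|
  puts "#{heading.inspect} -> #{find_association(heading, candidates).inspect}"
end
```

All three headings resolve to the same candidate, while a heading with no matching candidate yields `nil`.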
|
14
|
+
|
15
|
+
## Installation
|
16
|
+
|
17
|
+
Add gem instruction to your Gemfile. To use the Excel loader, JRuby is required, so to use in a mixed setup
|
18
|
+
of JRuby and deployed to other Rubies, use the following guard.
|
19
|
+
|
20
|
+
if(RUBY_PLATFORM =~ /java/)
|
21
|
+
gem 'activerecord-jdbcmysql-adapter'
|
22
|
+
else
|
23
|
+
gem 'mysql'
|
24
|
+
end
|
25
|
+
|
26
|
+
Usage outside a Rails project is currently untested, but to install the latest gem:
|
27
|
+
|
28
|
+
`gem install ar_loader`
|
29
|
+
|
30
|
+
To pull the tasks in, add the following call to your Rakefile:
|
31
|
+
|
32
|
+
ArLoader::require_tasks
|
33
|
+
|
34
|
+
## Example Spreadsheet
|
35
|
+
|
36
|
+
An example Spreadsheet with headers and comments, suitable for giving to Clients
|
37
|
+
to populate, can be found in test/examples/DemoSpreadsheet.xls
|
38
|
+
|
39
|
+
## Features
|
40
|
+
|
41
|
+
- *Direct Excel support*
|
42
|
+
|
43
|
+
Includes a wrapper around MS Excel via Apache POI, which
|
44
|
+
enables Products to be loaded directly from Excel via JRuby. No need to save to CSV first.
|
45
|
+
|
46
|
+
The Java jars (e.g. 'poi-3.6.jar') are included.
|
47
|
+
|
48
|
+
- *Semi-Smart Name Lookup*
|
49
|
+
|
50
|
+
Includes helper classes that find and store details of all possible associations on an AR class.
|
51
|
+
Given a user supplied name, attempts to find the requested association.
|
52
|
+
|
53
|
+
Example usage, load from a file or spreadsheet where the column names are only
|
54
|
+
an approximation of the actual associations, so given 'Product Properties' heading,
|
55
|
+
finds real association 'product_properties' to send or call on the AR object
|
56
|
+
|
57
|
+
- *Associations*
|
58
|
+
|
59
|
+
Can handle 'many' type associations and enables multiple association objects to
|
60
|
+
be added via single entry (column). See Details section.
|
61
|
+
|
62
|
+
- *Spree Rake Tasks*
|
63
|
+
|
64
|
+
Rake tasks provided for Spree loading - currently supports Product with associations,
|
65
|
+
and Image loading.
|
66
|
+
|
67
|
+
**Product loading from Excel specifically requires JRuby**.
|
68
|
+
|
69
|
+
Example command lines:
|
70
|
+
|
71
|
+
rake excel_load input=vendor\extensions\autotelik\fixtures\ExampleInfoWeb.xls
|
72
|
+
|
73
|
+
rake excel_load input=C:\MyProducts.xls verbose=true
|
74
|
+
|
75
|
+
- *Seamless Image loading* can be achieved by ensuring the SKU or class name features in the image filename.
|
76
|
+
|
77
|
+
Lookup is performed either via the SKU being prepended to the image name, or by the image name being equal to the **name attribute** of the klass in question.
|
78
|
+
|
79
|
+
Images can be attached to any class defined with a suitable association. The class to use can be configured in rake task via
|
80
|
+
parameter klass=Xyz.
|
81
|
+
|
82
|
+
In the Spree tasks, this defaults to Product, so attempts to attach Image to a Product via Product SKU or Name.
|
83
|
+
|
84
|
+
Image loading **does not** specifically require JRuby
|
85
|
+
|
86
|
+
A report is generated in the current working directory detailing any Images in the paths that could not be matched with a Product.
|
87
|
+
|
88
|
+
Example command lines:
|
89
|
+
|
90
|
+
rake image_load input=vendor\extensions\autotelik\lib\fixtures
|
91
|
+
rake image_load input="C:\images\Paintings" dummy=true
|
92
|
+
rake image_load input="C:\images\TaxonIcons" skip_if_no_assoc=true klass=Taxon
|
93
|
+
|
94
|
+
## Example Wrapper Tasks for Spree Site Extension
|
95
|
+
|
96
|
+
These tasks show how to write your own high-level wrapper task that will seed the database from multiple spreadsheets.
|
97
|
+
|
98
|
+
The images in this example have been named with the SKU present in name (separated by whitespace) e.g "PRINT_001 Stonehenge.jpg"
|
99
|
+
|
100
|
+
A report is generated in the current working directory detailing any Images in the paths that could not be matched with a Product.
|
101
|
+
|
102
|
+
require 'ar_loader'
|
103
|
+
|
104
|
+
namespace :mysite do
|
105
|
+
|
106
|
+
desc "Load Products for site"
|
107
|
+
task :load, :needs => [:environment] do |t, args|
|
108
|
+
|
109
|
+
[ "vendor/extensions/site/db/seed/Paintings.xls",
|
110
|
+
"vendor/extensions/site/db/seed/Drawings.xls"
|
111
|
+
].each do |x|
|
112
|
+
Rake::Task['autotelik:excel_load'].execute(
|
113
|
+
:input => x,
|
114
|
+
:verbose => true,
|
115
|
+
:sku_prefix => ""
|
116
|
+
)
|
117
|
+
end
|
118
|
+
end
|
119
|
+
|
120
|
+
desc "Load Images for site based on SKU"
|
121
|
+
task :load_images, :clean, :dummy, :needs => [:environment] do |t, args|
|
122
|
+
|
123
|
+
if(args[:clean])
|
124
|
+
Image.delete_all
|
125
|
+
FileUtils.rm_rf( "public/assets/products" )
|
126
|
+
end
|
127
|
+
|
128
|
+
["01_paintings_jpegs", "02_drawings_jpegs"].each do |x|
|
129
|
+
|
130
|
+
# image names start with associated Product SKU,
|
131
|
+
# skip rather than exit if no matching product found
|
132
|
+
|
133
|
+
Rake::Task['autotelik:image_load'].execute(
|
134
|
+
:input => "/my_site_load_info//#{x}",
|
135
|
+
:dummy => args[:dummy],
|
136
|
+
:verbose => false, :sku => true, :skip_if_no_assoc => true
|
137
|
+
)
|
138
|
+
end
|
139
|
+
end
end # closes namespace :mysite
|
140
|
+
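
The image naming convention used above (SKU first, separated from the rest of the name by whitespace, with a fall back to the name attribute) could be matched along these lines. This is a sketch under assumptions: `sku_from_image_name` and `find_owner` are hypothetical helpers, and plain hashes stand in for Product records.

```ruby
# Hypothetical helper: treat the text before the first whitespace as the SKU.
def sku_from_image_name(path)
  base = File.basename(path, File.extname(path))
  base.split(/\s+/, 2).first
end

# Hypothetical lookup: try the SKU prefix first, then the full base name.
def find_owner(path, records)
  base = File.basename(path, File.extname(path))
  sku  = sku_from_image_name(path)
  records.find { |r| r[:sku] == sku } || records.find { |r| r[:name] == base }
end

# Plain hashes stand in for Product records here.
records = [
  { :sku => 'PRINT_001', :name => 'Stonehenge' },
  { :sku => 'PRINT_002', :name => 'Avebury' }
]

['PRINT_001 Stonehenge.jpg', 'Avebury.png', 'unknown.jpg'].each do |file|
  owner = find_owner(file, records)
  puts "#{file} -> #{owner ? owner[:sku] : 'no match (would appear in the report)'}"
end
```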
|
141
|
+
## Details
|
142
|
+
|
143
|
+
### Associations
|
144
|
+
|
145
|
+
A single association column can contain multiple name/value sets in default form :
|
146
|
+
|
147
|
+
Name1:value1, value2|Name2:value1, value2, value3|Name3:value1, value2 etc
|
148
|
+
|
149
|
+
So for example a Column for an 'Option Types' association on a Product,
|
150
|
+
could contain 2 options with a number of values each :
|
151
|
+
|
152
|
+
'Option Types'
|
153
|
+
size:small,medium,large|colour:red,white
|
154
|
+
size:small|colour:blue,red,white
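
To make the column format concrete, here is a minimal sketch of how such a cell could be split into names and values. `parse_association_column` is a hypothetical helper written for this illustration; the delimiters (`|`, `:` and `,`) are taken from the default form shown above, and the loader's real parsing may differ.

```ruby
# Hypothetical parser for the default association column form:
#   Name1:value1, value2|Name2:value1, value2, value3
def parse_association_column(cell)
  cell.to_s.split('|').each_with_object({}) do |entry, result|
    name, values = entry.split(':', 2)
    next if name.nil? || name.strip.empty?
    # A name with no values maps to an empty list.
    result[name.strip] = values.to_s.split(',').map(&:strip).reject(&:empty?)
  end
end

p parse_association_column('size:small,medium,large|colour:red,white')
p parse_association_column('size:small|colour:blue,red,white')
```

Each row yields one entry per option type, so the first example row produces a `size` entry with three values and a `colour` entry with two.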
|
155
|
+
|
156
|
+
### Properties
|
157
|
+
|
158
|
+
The properties to associate with this product.
|
159
|
+
Properties are for small snippets of text, shared across many products,
|
160
|
+
and are for display purposes only.
|
161
|
+
|
162
|
+
An optional display value can be supplied to supplement the displayed text.
|
163
|
+
|
164
|
+
As for all associations can contain multiple name/value sets in default form :
|
165
|
+
|
166
|
+
Property:display_value|Property:display_value
|
167
|
+
|
168
|
+
Example - No values :
|
169
|
+
manufacturer|standard
|
170
|
+
|
171
|
+
Example - Display values :
|
172
|
+
manufacturer:somebody else plc|standard:ISOBlah21
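
As a rough illustration of the optional display value, the sketch below splits a Properties cell into name and display value pairs, using `nil` when no display value is given. `parse_properties` is a hypothetical helper, not part of the gem.

```ruby
# Hypothetical parser for a Properties cell:
#   Property:display_value|Property:display_value
def parse_properties(cell)
  pairs = cell.to_s.split('|').map do |entry|
    name, display = entry.split(':', 2).map(&:strip)
    display = nil if display && display.empty?
    [name, display]
  end
  # Drop empty entries such as those produced by a doubled delimiter.
  pairs.reject { |name, _| name.nil? || name.empty? }
end

p parse_properties('manufacturer|standard')
p parse_properties('manufacturer:somebody else plc|standard:ISOBlah21')
```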
|
173
|
+
|
174
|
+
## TODO
|
175
|
+
|
176
|
+
- Make more generic, so have smart switching to Ruby and directly support csv,
|
177
|
+
when JRuby and/or Excel not available.
|
178
|
+
|
179
|
+
- Smart sorting of column processing order ....
|
180
|
+
|
181
|
+
Does not currently ensure mandatory columns (for valid?) processed first.
|
182
|
+
Since Product needs saving before associations can be processed, user currently
|
183
|
+
needs to ensure SKU, name, price columns are among first columns
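
Until smart sorting exists, the ordering concern above could be handled outside the loader. The sketch below is one possible approach, not something the gem provides: `mandatory_first` is a hypothetical helper that moves mandatory headings to the front while keeping the remaining headings in their original order.

```ruby
# Hypothetical helper: move mandatory headings to the front,
# preserving the relative order within each group.
def mandatory_first(headers, mandatory)
  required = mandatory.map { |m| m.to_s.downcase }
  first, rest = headers.partition { |h| required.include?(h.to_s.strip.downcase) }
  first + rest
end

headers = ['Option Types', 'SKU', 'Description', 'Name', 'Price']
p mandatory_first(headers, %w[sku name price])
```

The caller would then need to process each row's cells in the same reordered sequence.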
|
184
|
+
|
185
|
+
## License
|
186
|
+
|
187
|
+
Copyright:: (c) Autotelik Media Ltd 2011
|
188
|
+
|
189
|
+
Author :: Tom Statter
|
190
|
+
|
191
|
+
Date :: Feb 2011
|
192
|
+
|
193
|
+
The MIT License
|
194
|
+
|
195
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
196
|
+
of this software and associated documentation files (the "Software"), to deal
|
197
|
+
in the Software without restriction, including without limitation the rights
|
198
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
199
|
+
copies of the Software, and to permit persons to whom the Software is
|
200
|
+
furnished to do so, subject to the following conditions:
|
201
|
+
|
202
|
+
The above copyright notice and this permission notice shall be included in
|
203
|
+
all copies or substantial portions of the Software.
|
204
|
+
|
205
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
206
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
207
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
208
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
209
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
210
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
211
|
+
THE SOFTWARE.
|
data/Rakefile
ADDED
@@ -0,0 +1,76 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'rake'
|
3
|
+
require 'rake/clean'
|
4
|
+
require 'rake/gempackagetask'
|
5
|
+
require 'rake/rdoctask'
|
6
|
+
require 'rake/testtask'
|
7
|
+
require "lib/ar_loader"
|
8
|
+
|
9
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
10
|
+
# Author :: Tom Statter
|
11
|
+
# Date :: Aug 2010
|
12
|
+
#
|
13
|
+
# License:: MIT - Free, OpenSource
|
14
|
+
#
|
15
|
+
# Details:: Gem::Specification for Active Record Loader gem.
|
16
|
+
#
|
17
|
+
# Specifically enabled for uploading Spree products but easily
|
18
|
+
# extended to any AR model.
|
19
|
+
#
|
20
|
+
# Currently supports direct access to Excel Spreadsheets via JRuby
|
21
|
+
#
|
22
|
+
# TODO - Switch for non JRuby Rubies, enable load via CSV file instead of Excel.
|
23
|
+
#
|
24
|
+
ArLoader::require_tasks
|
25
|
+
|
26
|
+
spec = Gem::Specification.new do |s|
|
27
|
+
s.name = ArLoader.gem_name
|
28
|
+
s.version = ArLoader.gem_version
|
29
|
+
s.has_rdoc = true
|
30
|
+
s.extra_rdoc_files = ['README.markdown', 'LICENSE']
|
31
|
+
s.summary = 'File based loader for Active Record models'
|
32
|
+
s.description = 'A file based loader for Active Record models. Seed database directly from Excel/CSV. Includes rake support for Spree'
|
33
|
+
s.author = 'thomas statter'
|
34
|
+
s.email = 'rubygems@autotelik.co.uk'
|
35
|
+
s.date = DateTime.now.strftime("%Y-%m-%d")
|
36
|
+
s.homepage = %q{http://www.autotelik.co.uk}
|
37
|
+
|
38
|
+
# s.executables = ['your_executable_here']
|
39
|
+
s.files = %w(LICENSE README.markdown Rakefile) + Dir.glob("{lib,spec,tasks}/**/*")
|
40
|
+
s.require_path = "lib"
|
41
|
+
s.bindir = "bin"
|
42
|
+
end
|
43
|
+
|
44
|
+
Rake::GemPackageTask.new(spec) do |p|
|
45
|
+
p.gem_spec = spec
|
46
|
+
p.need_tar = true
|
47
|
+
p.need_zip = true
|
48
|
+
end
|
49
|
+
|
50
|
+
Rake::RDocTask.new do |rdoc|
|
51
|
+
files =['README.markdown', 'LICENSE', 'lib/**/*.rb']
|
52
|
+
rdoc.rdoc_files.add(files)
|
53
|
+
rdoc.main = "README.markdown" # page to start on
|
54
|
+
rdoc.title = "ARLoader Docs"
|
55
|
+
rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
|
56
|
+
rdoc.options << '--line-numbers'
|
57
|
+
end
|
58
|
+
|
59
|
+
Rake::TestTask.new do |t|
|
60
|
+
t.test_files = FileList['test/**/*.rb']
|
61
|
+
end
|
62
|
+
|
63
|
+
# Add in our own Tasks
|
64
|
+
|
65
|
+
# Long parameter lists so ensure rake -T produces nice wide output
|
66
|
+
ENV['RAKE_COLUMNS'] = '180'
|
67
|
+
|
68
|
+
desc 'Build gem and install in one step'
|
69
|
+
task :pik_install, :needs => [:gem] do |t, args|
|
70
|
+
|
71
|
+
puts "Installing version #{ArLoader.gem_version}"
|
72
|
+
|
73
|
+
gem = "#{ArLoader.gem_name}-#{ArLoader.gem_version}.gem"
|
74
|
+
cmd = "pik gem install --no-ri --no-rdoc pkg\\#{gem}"
|
75
|
+
system(cmd)
|
76
|
+
end
|
data/lib/VERSION
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
0.0.4
|
data/lib/ar_loader.rb
ADDED
@@ -0,0 +1,53 @@
|
|
1
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
2
|
+
# Author :: Tom Statter
|
3
|
+
# Date :: Aug 2010
|
4
|
+
# License:: TBD. Free, Open Source. MIT ?
|
5
|
+
#
|
6
|
+
# Details:: Active Record Loader
|
7
|
+
#
|
8
|
+
module ArLoader
|
9
|
+
|
10
|
+
def self.gem_version
|
11
|
+
@gem_version ||= File.read( File.join( root_path, 'lib', 'VERSION') ).chomp
|
12
|
+
@gem_version
|
13
|
+
end
|
14
|
+
|
15
|
+
def self.gem_name
|
16
|
+
"ar_loader"
|
17
|
+
end
|
18
|
+
|
19
|
+
def self.root_path
|
20
|
+
File.expand_path("#{File.dirname(__FILE__)}/..")
|
21
|
+
end
|
22
|
+
|
23
|
+
def self.require_libraries
|
24
|
+
|
25
|
+
loader_libs = %w{ lib }
|
26
|
+
|
27
|
+
# Base search paths - these will be searched recursively and any xxx.rake files autoimported
|
28
|
+
loader_paths = []
|
29
|
+
|
30
|
+
loader_libs.each {|lib| loader_paths << File.join(root_path, lib) }
|
31
|
+
|
32
|
+
# Define require search paths, any dir in here will be added to LOAD_PATH
|
33
|
+
|
34
|
+
loader_paths.each do |base|
|
35
|
+
$:.unshift base if File.directory?(base)
|
36
|
+
Dir[File.join(base, '**', '**')].each do |p|
|
37
|
+
if File.directory? p
|
38
|
+
$:.unshift p
|
39
|
+
end
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
def self.require_tasks
|
45
|
+
# Long parameter lists so ensure rake -T produces nice wide output
|
46
|
+
ENV['RAKE_COLUMNS'] = '180'
|
47
|
+
base = File.join(root_path, 'tasks', '**')
|
48
|
+
Dir["#{base}/*.rake"].sort.each { |ext| load ext }
|
49
|
+
end
|
50
|
+
|
51
|
+
end
|
52
|
+
|
53
|
+
ArLoader::require_libraries
|
data/lib/engine/file_definitions.rb
ADDED
@@ -0,0 +1,353 @@
|
|
1
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
2
|
+
# Author :: Tom Statter
|
3
|
+
# Date :: Jan 2011
|
4
|
+
# License:: MIT
|
5
|
+
#
|
6
|
+
# Details:: This module acts as helpers for defining input/output file formats as classes.
|
7
|
+
#
|
8
|
+
# It provides a simple interface to define a file structure - field by field.
|
9
|
+
#
|
10
|
+
# By defining the structure, following methods and attributes are mixed in :
|
11
|
+
#
|
12
|
+
# An attribute, with accessor for each field/column.
|
13
|
+
# Parse a line, assigning values to each attribute.
|
14
|
+
# Parse an instance of that file line by line, accepts a block in which data can be processed.
|
15
|
+
# Method to split a file by field.
|
16
|
+
# Method to perform replace operations on a file by field and value.
|
17
|
+
#
|
18
|
+
# Either delimited or a fixed width definition can be created via macro-like class methods :
|
19
|
+
#
|
20
|
+
# create_field_definition [field_list]
|
21
|
+
#
|
22
|
+
# create_fixed_definition {field => range }
|
23
|
+
#
|
24
|
+
# Member attributes, with getters and setters, can be added for each field defined above via class method :
|
25
|
+
#
|
26
|
+
# create_field_attr_accessors
|
27
|
+
#
|
28
|
+
# USAGE :
|
29
|
+
#
|
30
|
+
# Create a class that contains definition of a file.
|
31
|
+
#
|
32
|
+
# class ExampleFixedWidth < FileDefinitionBase
|
33
|
+
# create_fixed_definition(:name => (0..7), :value => (8..15), :ccy => (16..18), :dr_or_cr => (19..19) )
|
34
|
+
#
|
35
|
+
# create_field_attr_accessors
|
36
|
+
# end
|
37
|
+
#
|
38
|
+
# class ExampleCSV < FileDefinitionBase
|
39
|
+
# create_field_definition %w{abc def ghi jkl}
|
40
|
+
#
|
41
|
+
# create_field_attr_accessors
|
42
|
+
# end
|
43
|
+
#
|
44
|
+
# Any instance can then be used to parse the defined file type, with each field or column value
|
45
|
+
# being assigned automatically to the associated instance variable.
|
46
|
+
#
|
47
|
+
# line = '1,2,3,4'
|
48
|
+
# x = ExampleCSV.new( line )
|
49
|
+
#
|
50
|
+
# assert x.respond_to? :jkl
|
51
|
+
# assert_equal x.abc, '1'
|
52
|
+
# assert_equal x.jkl.to_i, 4
|
53
|
+
#
|
54
|
+
module FileDefinitions
|
55
|
+
|
56
|
+
include Enumerable
|
57
|
+
|
58
|
+
attr_accessor :key
|
59
|
+
attr_accessor :current_line
|
60
|
+
|
61
|
+
# Set the delimiter to use when splitting a line - can be either a String, or a Regexp
|
62
|
+
attr_writer :field_delim
|
63
|
+
|
64
|
+
def initialize( line = nil )
|
65
|
+
@key = String.new
|
66
|
+
parse(line) unless line.nil?
|
67
|
+
end
|
68
|
+
|
69
|
+
def self.included(base)
|
70
|
+
base.extend(ClassMethods)
|
71
|
+
subclasses << base
|
72
|
+
end
|
73
|
+
|
74
|
+
def self.subclasses
|
75
|
+
@subclasses ||=[]
|
76
|
+
end
|
77
|
+
|
78
|
+
|
79
|
+
# Return the field delimiter used when splitting a line
|
80
|
+
def field_delim
|
81
|
+
@field_delim || ','
|
82
|
+
end
|
83
|
+
|
84
|
+
# Parse each line of a file based on the field definition, yields self for each successive line
|
85
|
+
#
|
86
|
+
def each( file )
|
87
|
+
File::new(file).each_line do |line|
|
88
|
+
parse( line )
|
89
|
+
yield self
|
90
|
+
end
|
91
|
+
end
|
92
|
+
|
93
|
+
def fields
|
94
|
+
@fields = self.class.field_definition.collect {|f| instance_variable_get "@#{f}" }
|
95
|
+
@fields
|
96
|
+
end
|
97
|
+
|
98
|
+
def to_s
|
99
|
+
fields.join(',')
|
100
|
+
end
|
101
|
+
|
102
|
+
module ClassMethods
|
103
|
+
|
104
|
+
# Helper to generate methods to store and return the complete list of fields
|
105
|
+
# in this File definition (also creates member @field_definition) and parse a line.
|
106
|
+
#
|
107
|
+
# e.g create_field_definition %w{ trade_id drOrCr ccy costCentre postingDate amount }
|
108
|
+
#
|
109
|
+
def create_field_definition( *fields )
|
110
|
+
instance_eval <<-end_eval
|
111
|
+
@field_definition ||= %w{ #{fields.join(' ')} }
|
112
|
+
def field_definition
|
113
|
+
@field_definition
|
114
|
+
end
|
115
|
+
end_eval
|
116
|
+
|
117
|
+
class_eval <<-end_eval
|
118
|
+
def parse( line )
|
119
|
+
@current_line = line
|
120
|
+
before_parse if respond_to? :before_parse
|
121
|
+
@current_line.split(field_delim()).each_with_index {|x, i| instance_variable_set(\"@\#{self.class.field_definition[i]}\", x) }
|
122
|
+
after_parse if respond_to? :after_parse
|
123
|
+
generate_key if respond_to? :generate_key
|
124
|
+
end
|
125
|
+
end_eval
|
126
|
+
end
|
127
|
+
|
128
|
+
def add_field(field, add_accessor = true)
|
129
|
+
@field_definition ||= []
|
130
|
+
@field_definition << field.to_s
|
131
|
+
attr_accessor field if(add_accessor)
|
132
|
+
end
|
133
|
+
|
134
|
+
|
135
|
+
# Helper to generate methods that return the complete list of fixed width fields
|
136
|
+
# and associated ranges in this File definition, and parse a line.
|
137
|
+
# e.g create_fixed_definition(:name => (0..7), :value => (8..15), :ccy => (16..18), :dr_or_cr => (19..19) )
|
138
|
+
#
|
139
|
+
def create_fixed_definition( field_range_map )
|
140
|
+
raise ArgumentError.new('Please supply hash to create_fixed_definition') unless field_range_map.is_a? Hash
|
141
|
+
|
142
|
+
keys = field_range_map.keys.collect(&:to_s)
|
143
|
+
string_map = Hash[*keys.zip(field_range_map.values).flatten]
|
144
|
+
|
145
|
+
instance_eval <<-end_eval
|
146
|
+
def fixed_definition
|
147
|
+
@fixed_definition ||= #{string_map.inspect}
|
148
|
+
@fixed_definition
|
149
|
+
end
|
150
|
+
end_eval
|
151
|
+
|
152
|
+
instance_eval <<-end_eval
|
153
|
+
def field_definition
|
154
|
+
@field_definition ||= %w{ #{keys.join(' ')} }
|
155
|
+
@field_definition
|
156
|
+
end
|
157
|
+
end_eval
|
158
|
+
|
159
|
+
class_eval <<-end_eval
|
160
|
+
def parse( line )
|
161
|
+
@current_line = line
|
162
|
+
before_parse if respond_to? :before_parse
|
163
|
+
self.class.fixed_definition.each do |key, range|
|
164
|
+
instance_variable_set(\"@\#{key}\", @current_line[range])
|
165
|
+
end
|
166
|
+
after_parse if respond_to? :after_parse
|
167
|
+
generate_key if respond_to? :generate_key
|
168
|
+
end
|
169
|
+
end_eval
|
170
|
+
|
171
|
+
end
|
172
|
+
|
173
|
+
# Create accessors for each field
|
174
|
+
def create_field_attr_accessors
|
175
|
+
self.field_definition.each {|f| attr_accessor f}
|
176
|
+
end
|
177
|
+
|
178
|
+
|
179
|
+
###############################
|
180
|
+
# PARSING + FILE MANIPULATION #
|
181
|
+
###############################
|
182
|
+
|
183
|
+
# Parse a complete file and return array of self, one per line
|
184
|
+
def parse_file( file, options = {} )
|
185
|
+
limit = options[:limit]
|
186
|
+
count = 0
|
187
|
+
lines = []
|
188
|
+
File::new(file).each_line do |line|
|
189
|
+
break if limit && ((count += 1) > limit)
|
190
|
+
lines << self.new( line )
|
191
|
+
end
|
192
|
+
lines
|
193
|
+
end
|
194
|
+
|
195
|
+
|
196
|
+
|
197
|
+
# Split a file, whose field definition is represented by self,
|
198
|
+
# into separate streams, based on the values of one of its fields.
|
199
|
+
#
|
200
|
+
# Writes the results, one file per split stream, to directory specified by output_path
|
201
|
+
#
|
202
|
+
# Options:
|
203
|
+
#
|
204
|
+
# :keys => Also write split files of the key fields
|
205
|
+
#
|
206
|
+
# :filter => Optional Regular Expression to act as a filter applied to the field.
|
207
|
+
# For example split by Ccy but filter to only include certain ccys pass
|
208
|
+
# filter => 'GBP|USD'
|
209
|
+
#
|
210
|
+
def split_on_write( file_name, field, output_path, options = {} )
|
211
|
+
|
212
|
+
path = output_path || '.'
|
213
|
+
|
214
|
+
filtered = split_on( file_name, field, options )
|
215
|
+
|
216
|
+
unless filtered.empty?
|
217
|
+
log :info, "Writing separate streams to #{path}"
|
218
|
+
|
219
|
+
filtered.each { |strm, objects| RecsBase::write( {"keys_#{field}_#{strm}.csv" => objects.collect(&:key).join("\n")}, path) } if(options.key?(:keys))
|
220
|
+
|
221
|
+
filtered.each { |strm, objects| RecsBase::write( {"#{field}_#{strm}.csv" => objects.collect(&:current_line).join("\n")}, path) }
|
222
|
+
end
|
223
|
+
end
|
224
|
+
|
225
|
+
# Split a file, whose field definition is represented by self,
|
226
|
+
# into separate streams, based on one of its fields.
|
227
|
+
#
|
228
|
+
# Returns a map of Field value => File def object
|
229
|
+
#
|
230
|
+
# We return the File Def object as this is now enriched, e.g with key fields, compared to the raw file.
|
231
|
+
#
|
232
|
+
# Users can get at the raw line simply by calling the current_line() method on File Def object
|
233
|
+
#
|
234
|
+
# Options:
|
235
|
+
#
|
236
|
+
# :output_path => directory to write the individual streams files to
|
237
|
+
#
|
238
|
+
# :filter => Optional Regular Expression to act as a filter applied to the field.
|
239
|
+
# For example split by Ccy but filter to only include certain ccys pass
|
240
|
+
# filter => 'GBP|USD|EUR'
|
241
|
+
#
|
242
|
+
def split_on( file_name, field, options = {} )
|
243
|
+
|
244
|
+
regex = options[:filter] ? Regexp.new(options[:filter]) : nil
|
245
|
+
|
246
|
+
log :debug, "Using REGEX: #{regex.inspect}" if regex
|
247
|
+
|
248
|
+
filtered = {}
|
249
|
+
|
250
|
+
if( self.new.respond_to?(field) )
|
251
|
+
|
252
|
+
log :info, "Splitting on #{field}"
|
253
|
+
|
254
|
+
File.open( file_name ) do |t|
|
255
|
+
t.each do |line|
|
256
|
+
line = line.to_s.chomp # chomp! returns nil when there is no trailing newline, which skipped the last line
next if line.empty?
|
257
|
+
x = self.new(line)
|
258
|
+
|
259
|
+
value = x.send( field.to_sym ) # the actual field value from the specified field column
|
260
|
+
next if value.nil?
|
261
|
+
|
262
|
+
if( regex.nil? || value.match(regex) )
|
263
|
+
filtered[value] ? filtered[value] << x : filtered[value] = [x]
|
264
|
+
end
|
265
|
+
end
|
266
|
+
end
|
267
|
+
else
|
268
|
+
log :warn, "Field [#{field}] not defined for file definition #{self.name}"
|
269
|
+
end
|
270
|
+
|
271
|
+
if( options[:sort])
|
272
|
+
filtered.values.each( &:sort )
|
273
|
+
return filtered
|
274
|
+
end
|
275
|
+
return filtered
|
276
|
+
end
|
277
|
+
|
278
|
+
# Open and parse a file, replacing a value in the specified field.
|
279
|
+
# Does not update the file itself. Does not write a new output file.
|
280
|
+
#
|
281
|
+
# Returns :
|
282
|
+
# 1) full collection of updated lines
|
283
|
+
# 2) collection of file def objects (self), with updated value.
|
284
|
+
#
|
285
|
+
# Finds values matching old_value in given map
|
286
|
+
#
|
287
|
+
# Replaces matches with new_value in map.
|
288
|
+
#
|
289
|
+
# Accepts more than one field, if fields is either an array of strings
|
290
|
+
# or a comma separated list of fields.
|
291
|
+
#
|
292
|
+
def file_set_field_by_map( file_name, fields, value_map, regex = nil )
|
293
|
+
|
294
|
+
lines, objects = [],[]
|
295
|
+
|
296
|
+
if fields.is_a?(Array)
|
297
|
+
attribs = fields
|
298
|
+
else
|
299
|
+
attribs = "#{fields}".split(',')
|
300
|
+
end
|
301
|
+
|
302
|
+
attribs.each do |attrib|
|
303
|
+
raise BadConfigError.new("Field: #{attrib} is not a field on #{self.name}") unless self.new.respond_to?(attrib)
|
304
|
+
end
|
305
|
+
|
306
|
+
log :info, "#{self.name} - updating field(s) #{fields} in #{file_name}"
|
307
|
+
|
308
|
+
File.open( file_name ) do |t|
|
309
|
+
t.each do |line|
|
310
|
+
if line.chomp.empty?
|
311
|
+
lines << line
|
312
|
+
objects << self.new
|
313
|
+
next
|
314
|
+
end
|
315
|
+
x = self.new(line)
|
316
|
+
|
317
|
+
attribs.each do |a|
|
318
|
+
old_value = x.instance_variable_get( "@#{a}" )
|
319
|
+
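# Assumption: when a regex is supplied and matches, fall back to the first replacement in the map
replacement = value_map[old_value]
replacement ||= value_map.values.first if regex && old_value.to_s.match(regex)
x.instance_variable_set( "@#{a}", replacement ) if replacement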
|
320
|
+
end
|
321
|
+
|
322
|
+
objects << x
|
323
|
+
lines << x.to_s
|
324
|
+
end
|
325
|
+
end
|
326
|
+
|
327
|
+
return lines, objects
|
328
|
+
end
|
329
|
+
end # END class methods
|
330
|
+
|
331
|
+
# Open and parse a file, replacing a value in the specified field.
|
332
|
+
# Does not update the file itself. Does not write a new output file.
|
333
|
+
#
|
334
|
+
# Returns :
|
335
|
+
# 1) full collection of updated lines
|
336
|
+
# 2) collection of file def objects (self), with updated value.
|
337
|
+
#
|
338
|
+
# Finds values matching old_value, and also accepts an optional regex for more powerful
|
339
|
+
# matching strategies of values on the specified field.
|
340
|
+
#
|
341
|
+
# Replaces matches with new_value.
|
342
|
+
#
|
343
|
+
# Accepts more than one field, if fields is either an array of strings
|
344
|
+
# or a comma separated list of fields.
|
345
|
+
#
|
346
|
+
def file_set_field( file_name, field, old_value, new_value, regex = nil )
|
347
|
+
|
348
|
+
map = {old_value => new_value}
|
349
|
+
|
350
|
+
return file_set_field_by_map(file_name, field, map, regex)
|
351
|
+
end
|
352
|
+
|
353
|
+
end
|
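
The two parsing strategies defined in file_definitions.rb (delimited and fixed width) can be illustrated without the metaprogramming. The sketch below is a simplified stand-in rather than the module itself: `parse_fixed` and `parse_delimited` are hypothetical helpers that return plain hashes instead of setting instance variables, and the sample line is invented. Unlike the module's `parse`, the delimited helper strips the trailing newline first.

```ruby
# Hypothetical stand-in for create_fixed_definition: slice each field by its column range.
def parse_fixed(line, layout)
  layout.each_with_object({}) { |(field, range), row| row[field] = line[range] }
end

# Hypothetical stand-in for create_field_definition: pair field names with split values.
def parse_delimited(line, fields, delim = ',')
  Hash[fields.zip(line.chomp.split(delim))]
end

# Field name => column range, mirroring the fixed width example in the file's header comment.
layout = { 'name' => (0..7), 'value' => (8..15), 'ccy' => (16..18), 'dr_or_cr' => (19..19) }

p parse_fixed('ACME    00012345GBPD', layout)
p parse_delimited("1,2,3,4\n", %w[abc def ghi jkl])
```

Fixed width values keep their padding, so callers would typically strip them before use.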