ar_loader 0.0.4
- data/LICENSE +9 -0
- data/README.markdown +211 -0
- data/Rakefile +76 -0
- data/lib/VERSION +1 -0
- data/lib/ar_loader.rb +53 -0
- data/lib/engine/file_definitions.rb +353 -0
- data/lib/engine/jruby/jexcel_file.rb +182 -0
- data/lib/engine/jruby/method_mapper_excel.rb +44 -0
- data/lib/engine/mapping_file_definitions.rb +88 -0
- data/lib/engine/method_detail.rb +139 -0
- data/lib/engine/method_mapper.rb +157 -0
- data/lib/engine/method_mapper_csv.rb +28 -0
- data/lib/engine/word.rb +70 -0
- data/lib/java/poi-3.2-FINAL-20081019.jar +0 -0
- data/lib/java/poi-3.6.jar +0 -0
- data/lib/java/poi-contrib-3.6-20091214.jar +0 -0
- data/lib/java/poi-examples-3.6-20091214.jar +0 -0
- data/lib/java/poi-ooxml-3.6-20091214.jar +0 -0
- data/lib/java/poi-ooxml-schemas-3.6-20091214.jar +0 -0
- data/lib/java/poi-scratchpad-3.6-20091214.jar +0 -0
- data/lib/loaders/loader_base.rb +61 -0
- data/lib/loaders/spree/image_loader.rb +47 -0
- data/lib/loaders/spree/product_loader.rb +93 -0
- data/lib/to_b.rb +24 -0
- data/spec/excel_loader_spec.rb +138 -0
- data/spec/spec_helper.rb +37 -0
- data/tasks/db_tasks.rake +65 -0
- data/tasks/excel_loader.rake +101 -0
- data/tasks/file_tasks.rake +38 -0
- data/tasks/seed_fu_product_template.erb +15 -0
- data/tasks/spree/image_load.rake +103 -0
- data/tasks/spree/product_loader.rake +107 -0
- data/tasks/tidy_config.txt +13 -0
- data/tasks/word_to_seedfu.rake +167 -0
- metadata +90 -0
data/LICENSE
ADDED
data/README.markdown
ADDED
@@ -0,0 +1,211 @@
|
|
1
|
+
# AR Loader
|
2
|
+
|
3
|
+
General Active Record Loader with current focus on support for Spree.
|
4
|
+
|
5
|
+
Maps column headings to attributes and associations.
|
6
|
+
|
7
|
+
Fully extendable via spreadsheet headings - simply add new column to Excel with
|
8
|
+
attribute or association name, and loader will attempt to
|
9
|
+
find correct association and populate AR object with row data.
|
10
|
+
|
11
|
+
Can handle human-readable forms of column names. For example, given an association on an AR model called
|
12
|
+
product_properties, will map from column headings such as 'product_properties',
|
13
|
+
'Product Properties', 'product properties' etc
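The heading matching described above can be sketched roughly as follows. This is a minimal illustration, not the gem's own lookup code: `normalise_heading` and `find_association` are hypothetical helper names, and the candidate list is made up for the example.

```ruby
# Hypothetical helper, not part of ar_loader: reduces a heading to a
# canonical form so differently formatted names compare equal.
def normalise_heading(heading)
  heading.to_s.strip.downcase.gsub(/[\s_]+/, '_')
end

# Returns the first candidate whose canonical form matches the heading, or nil.
def find_association(heading, candidates)
  wanted = normalise_heading(heading)
  candidates.find { |name| normalise_heading(name) == wanted }
end

# Assumed association names, for illustration only.
associations = %w[product_properties option_types taxons]

['product_properties', 'Product Properties', 'product properties'].each do |heading|
  puts "#{heading.inspect} -> #{find_association(heading, associations).inspect}"
end
```

Comparing canonical forms on both sides keeps the real association name untouched, so the matched name can be sent to the AR object as-is.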
|
14
|
+
|
15
|
+
## Installation
|
16
|
+
|
17
|
+
Add gem instruction to your Gemfile. To use the Excel loader, JRuby is required, so to use in a mixed setup
|
18
|
+
of JRuby and deployed to other Rubies, use the following guard.
|
19
|
+
    if(RUBY_PLATFORM =~ /java/)
      gem 'activerecord-jdbcmysql-adapter'
    else
      gem 'mysql'
    end
|
25
|
+
|
26
|
+
AR usage outside a Rails project is currently untested, but to install the latest gem :
|
27
|
+
|
28
|
+
`gem install ar_loader`
|
29
|
+
|
30
|
+
To pull the tasks in, add call in your Rakefile to :
|
31
|
+
|
32
|
+
ArLoader::require_tasks
|
33
|
+
|
34
|
+
## Example Spreadsheet
|
35
|
+
|
36
|
+
An example Spreadsheet with headers and comments, suitable for giving to Clients
|
37
|
+
to populate, can be found in test/examples/DemoSpreadsheet.xls
|
38
|
+
|
39
|
+
## Features
|
40
|
+
|
41
|
+
- *Direct Excel support*
|
42
|
+
|
43
|
+
Includes a wrapper around MS Excel via Apache POI, which
|
44
|
+
enables Products to be loaded directly from Excel via JRuby. No need to save to CSV first.
|
45
|
+
|
46
|
+
The java jars e.g - 'poi-3.6.jar' - are included.
|
47
|
+
|
48
|
+
- *Semi-Smart Name Lookup*
|
49
|
+
|
50
|
+
Includes helper classes that find and store details of all possible associations on an AR class.
|
51
|
+
Given a user supplied name, attempts to find the requested association.
|
52
|
+
|
53
|
+
Example usage, load from a file or spreadsheet where the column names are only
|
54
|
+
an approximation of the actual associations, so given 'Product Properties' heading,
|
55
|
+
finds real association 'product_properties' to send or call on the AR object
|
56
|
+
|
57
|
+
- *Associations*
|
58
|
+
|
59
|
+
Can handle 'many' type associations and enables multiple association objects to
|
60
|
+
be added via single entry (column). See Details section.
|
61
|
+
|
62
|
+
- *Spree Rake Tasks*
|
63
|
+
|
64
|
+
Rake tasks provided for Spree loading - currently supports Product with associations,
|
65
|
+
and Image loading.
|
66
|
+
|
67
|
+
**Product loading from Excel specifically requires JRuby**.
|
68
|
+
|
69
|
+
Example command lines:
|
70
|
+
|
71
|
+
rake excel_load input=vendor\extensions\autotelik\fixtures\ExampleInfoWeb.xls
|
72
|
+
|
73
|
+
rake excel_load input=C:\MyProducts.xls verbose=true
|
74
|
+
|
75
|
+
- *Seamless Image loading* can be achieved by ensuring the SKU or class Name features in the Image filename.
|
76
|
+
|
77
|
+
Lookup is performed either via the SKU being prepended to the image name, or by the image name being equal to the **name attribute** of the klass in question.
|
78
|
+
|
79
|
+
Images can be attached to any class defined with a suitable association. The class to use can be configured in rake task via
|
80
|
+
parameter klass=Xyz.
|
81
|
+
|
82
|
+
In the Spree tasks, this defaults to Product, so attempts to attach Image to a Product via Product SKU or Name.
|
83
|
+
|
84
|
+
Image loading **does not** specifically require JRuby
|
85
|
+
|
86
|
+
A report is generated in the current working directory detailing any Images in the paths that could not be matched with a Product.
|
87
|
+
|
88
|
+
Example cmd lines :
|
89
|
+
|
90
|
+
rake image_load input=vendor\extensions\autotelik\lib\fixtures
|
91
|
+
rake image_load input="C:\images\Paintings" dummy=true
|
92
|
+
rake image_load input="C:\images\TaxonIcons" skip_if_no_assoc=true klass=Taxon
|
93
|
+
|
94
|
+
## Example Wrapper Tasks for Spree Site Extension
|
95
|
+
|
96
|
+
These tasks show how to write your own high level wrapper task that will seed the database from multiple spreadsheets.
|
97
|
+
|
98
|
+
The images in this example have been named with the SKU present in name (separated by whitespace) e.g "PRINT_001 Stonehenge.jpg"
|
99
|
+
|
100
|
+
A report is generated in the current working directory detailing any Images in the paths that could not be matched with a Product.
|
101
|
+
    require 'ar_loader'

    namespace :mysite do

      desc "Load Products for site"
      task :load, :needs => [:environment] do |t, args|

        [ "vendor/extensions/site/db/seed/Paintings.xls",
          "vendor/extensions/site/db/seed/Drawings.xls"
        ].each do |x|
          Rake::Task['autotelik:excel_load'].execute(
            :input => x,
            :verbose => true,
            :sku_prefix => ""
          )
        end
      end

      desc "Load Images for site based on SKU"
      task :load_images, :clean, :dummy, :needs => [:environment] do |t, args|

        if(args[:clean])
          Image.delete_all
          FileUtils.rm_rf( "public/assets/products" )
        end

        ["01_paintings_jpegs", "02_drawings_jpegs"].each do |x|

          # image names start with associated Product SKU,
          # skip rather than exit if no matching product found

          Rake::Task['autotelik:image_load'].execute(
            :input => "/my_site_load_info//#{x}",
            :dummy => args[:dummy],
            :verbose => false, :sku => true, :skip_if_no_assoc => true
          )
        end
      end
    end
|
140
|
+
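The SKU-based image matching described in this section might look something like the sketch below. It assumes the SKU is the text before the first whitespace in the file name, as in "PRINT_001 Stonehenge.jpg". The helpers `sku_from_image_name` and `match_images` are hypothetical, and a plain Hash stands in for the Product lookup.

```ruby
# Hypothetical helper: takes the text before the first whitespace in the
# file's base name as the SKU, e.g. "PRINT_001 Stonehenge.jpg" gives "PRINT_001".
def sku_from_image_name(path)
  base = File.basename(path, File.extname(path))
  base.split(/\s+/, 2).first
end

# Returns [matched, unmatched]: matched maps each path to its product,
# unmatched lists the paths whose SKU has no product (the report candidates).
def match_images(paths, products_by_sku)
  matched = {}
  unmatched = []
  paths.each do |path|
    product = products_by_sku[sku_from_image_name(path)]
    if product
      matched[path] = product
    else
      unmatched << path
    end
  end
  [matched, unmatched]
end

# A Hash stands in for the database lookup in this sketch.
products = { 'PRINT_001' => 'Stonehenge print' }
matched, unmatched = match_images(['PRINT_001 Stonehenge.jpg', 'PRINT_999 Avebury.jpg'], products)
puts "matched:   #{matched.keys.inspect}"
puts "unmatched: #{unmatched.inspect}"
```

Collecting the unmatched paths rather than raising mirrors the `skip_if_no_assoc` behaviour, where loading continues and the misses are reported afterwards.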
|
141
|
+
## Details
|
142
|
+
|
143
|
+
### Associations
|
144
|
+
|
145
|
+
A single association column can contain multiple name/value sets in default form :
|
146
|
+
|
147
|
+
Name1:value1, value2|Name2:value1, value2, value3|Name3:value1, value2 etc
|
148
|
+
|
149
|
+
So for example a Column for an 'Option Types' association on a Product,
|
150
|
+
could contain 2 options with a number of values each :
|
151
|
+
|
152
|
+
'Option Types'
|
153
|
+
size:small,medium,large|colour:red,white
|
154
|
+
size:small|colour:blue,red,white
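As a rough illustration of the default column format, the sketch below splits one cell into names and value lists. `parse_association_column` is a hypothetical name, not a method of the gem, and the delimiters (`|`, `:` and `,`) are taken from the format shown above.

```ruby
# Hypothetical parser for the default association column format:
#   Name1:value1, value2|Name2:value1, value2, value3
# Returns a Hash of name => array of values.
def parse_association_column(cell)
  cell.to_s.split('|').each_with_object({}) do |entry, result|
    name, values = entry.split(':', 2)
    next if name.nil? || name.strip.empty?

    result[name.strip] = values.to_s.split(',').map(&:strip).reject(&:empty?)
  end
end

p parse_association_column('size:small,medium,large|colour:red,white')
p parse_association_column('size:small|colour:blue, red, white')
```

Splitting on `:` with a limit of 2 keeps any later colons inside the value list, and a name with no values simply maps to an empty array.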
|
155
|
+
|
156
|
+
### Properties
|
157
|
+
|
158
|
+
The properties to associate with this product.
|
159
|
+
Properties are for small snippets of text, shared across many products,
|
160
|
+
and are for display purposes only.
|
161
|
+
|
162
|
+
An optional display value can be supplied to supplement the displayed text.
|
163
|
+
|
164
|
+
As for all associations, the column can contain multiple name/value sets in default form :
|
165
|
+
|
166
|
+
Property:display_value|Property:display_value
|
167
|
+
|
168
|
+
Example - No values :
|
169
|
+
manufacturer|standard
|
170
|
+
|
171
|
+
Example - Display values :
|
172
|
+
manufacturer:somebody else plc|standard:ISOBlah21
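The two property examples can be handled by one small routine, sketched here under the assumption that each entry carries at most one display value. `parse_properties` is a hypothetical name used only for this illustration.

```ruby
# Hypothetical parser for a Properties cell. Each entry is either
# "Property" or "Property:display_value"; returns [name, display_value]
# pairs, with nil where no display value was supplied.
def parse_properties(cell)
  pairs = cell.to_s.split('|').map do |entry|
    name, display = entry.split(':', 2).map(&:strip)
    display = nil if display && display.empty?
    [name, display]
  end
  pairs.reject { |name, _display| name.nil? || name.empty? }
end

p parse_properties('manufacturer|standard')
p parse_properties('manufacturer:somebody else plc|standard:ISOBlah21')
```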
|
173
|
+
|
174
|
+
## TODO
|
175
|
+
|
176
|
+
- Make more generic, so have smart switching to Ruby and directly support csv,
|
177
|
+
when JRuby and/or Excel not available.
|
178
|
+
|
179
|
+
- Smart sorting of column processing order ....
|
180
|
+
|
181
|
+
Does not currently ensure mandatory columns (for valid?) processed first.
|
182
|
+
Since Product needs saving before associations can be processed, user currently
|
183
|
+
needs to ensure SKU, name, price columns are among first columns
|
184
|
+
|
185
|
+
## License
|
186
|
+
|
187
|
+
Copyright:: (c) Autotelik Media Ltd 2011
|
188
|
+
|
189
|
+
Author :: Tom Statter
|
190
|
+
|
191
|
+
Date :: Feb 2011
|
192
|
+
|
193
|
+
The MIT License
|
194
|
+
|
195
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
196
|
+
of this software and associated documentation files (the "Software"), to deal
|
197
|
+
in the Software without restriction, including without limitation the rights
|
198
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
199
|
+
copies of the Software, and to permit persons to whom the Software is
|
200
|
+
furnished to do so, subject to the following conditions:
|
201
|
+
|
202
|
+
The above copyright notice and this permission notice shall be included in
|
203
|
+
all copies or substantial portions of the Software.
|
204
|
+
|
205
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
206
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
207
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
208
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
209
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
210
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
211
|
+
THE SOFTWARE.
|
data/Rakefile
ADDED
@@ -0,0 +1,76 @@
|
|
1
|
+
require 'rubygems'
|
2
|
+
require 'rake'
|
3
|
+
require 'rake/clean'
|
4
|
+
require 'rake/gempackagetask'
|
5
|
+
require 'rake/rdoctask'
|
6
|
+
require 'rake/testtask'
|
7
|
+
require "lib/ar_loader"
|
8
|
+
|
9
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
10
|
+
# Author :: Tom Statter
|
11
|
+
# Date :: Aug 2010
|
12
|
+
#
|
13
|
+
# License:: MIT - Free, OpenSource
|
14
|
+
#
|
15
|
+
# Details:: Gem::Specification for Active Record Loader gem.
|
16
|
+
#
|
17
|
+
# Specifically enabled for uploading Spree products but easily
|
18
|
+
# extended to any AR model.
|
19
|
+
#
|
20
|
+
# Currently supports direct access to Excel Spreadsheets via JRuby
|
21
|
+
#
|
22
|
+
# TODO - Switch for non JRuby Rubies, enable load via CSV file instead of Excel.
|
23
|
+
#
|
24
|
+
ArLoader::require_tasks
|
25
|
+
|
26
|
+
spec = Gem::Specification.new do |s|
|
27
|
+
s.name = ArLoader.gem_name
|
28
|
+
s.version = ArLoader.gem_version
|
29
|
+
s.has_rdoc = true
|
30
|
+
s.extra_rdoc_files = ['README.markdown', 'LICENSE']
|
31
|
+
s.summary = 'File based loader for Active Record models'
|
32
|
+
s.description = 'A file based loader for Active Record models. Seed database directly from Excel/CSV. Includes rake support for Spree'
|
33
|
+
s.author = 'thomas statter'
|
34
|
+
s.email = 'rubygems@autotelik.co.uk'
|
35
|
+
s.date = DateTime.now.strftime("%Y-%m-%d")
|
36
|
+
s.homepage = %q{http://www.autotelik.co.uk}
|
37
|
+
|
38
|
+
# s.executables = ['your_executable_here']
|
39
|
+
s.files = %w(LICENSE README.markdown Rakefile) + Dir.glob("{lib,spec,tasks}/**/*")
|
40
|
+
s.require_path = "lib"
|
41
|
+
s.bindir = "bin"
|
42
|
+
end
|
43
|
+
|
44
|
+
Rake::GemPackageTask.new(spec) do |p|
|
45
|
+
p.gem_spec = spec
|
46
|
+
p.need_tar = true
|
47
|
+
p.need_zip = true
|
48
|
+
end
|
49
|
+
|
50
|
+
Rake::RDocTask.new do |rdoc|
|
51
|
+
files =['README.markdown', 'LICENSE', 'lib/**/*.rb']
|
52
|
+
rdoc.rdoc_files.add(files)
|
53
|
+
rdoc.main = "README.markdown" # page to start on
|
54
|
+
rdoc.title = "ARLoader Docs"
|
55
|
+
rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
|
56
|
+
rdoc.options << '--line-numbers'
|
57
|
+
end
|
58
|
+
|
59
|
+
Rake::TestTask.new do |t|
|
60
|
+
t.test_files = FileList['test/**/*.rb']
|
61
|
+
end
|
62
|
+
|
63
|
+
# Add in our own Tasks
|
64
|
+
|
65
|
+
# Long parameter lists so ensure rake -T produces nice wide output
|
66
|
+
ENV['RAKE_COLUMNS'] = '180'
|
67
|
+
|
68
|
+
desc 'Build gem and install in one step'
|
69
|
+
task :pik_install, :needs => [:gem] do |t, args|
|
70
|
+
|
71
|
+
puts "Installing version #{ArLoader.gem_version}"
|
72
|
+
|
73
|
+
gem = "#{ArLoader.gem_name}-#{ArLoader.gem_version}.gem"
|
74
|
+
cmd = "pik gem install --no-ri --no-rdoc pkg\\#{gem}"
|
75
|
+
system(cmd)
|
76
|
+
end
|
data/lib/VERSION
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
0.0.4
|
data/lib/ar_loader.rb
ADDED
@@ -0,0 +1,53 @@
|
|
1
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
2
|
+
# Author :: Tom Statter
|
3
|
+
# Date :: Aug 2010
|
4
|
+
# License:: TBD. Free, Open Source. MIT ?
|
5
|
+
#
|
6
|
+
# Details:: Active Record Loader
|
7
|
+
#
|
8
|
+
module ArLoader
|
9
|
+
|
10
|
+
def self.gem_version
|
11
|
+
@gem_version ||= File.read( File.join( root_path, 'lib', 'VERSION') ).chomp
|
12
|
+
@gem_version
|
13
|
+
end
|
14
|
+
|
15
|
+
def self.gem_name
|
16
|
+
"ar_loader"
|
17
|
+
end
|
18
|
+
|
19
|
+
def self.root_path
|
20
|
+
File.expand_path("#{File.dirname(__FILE__)}/..")
|
21
|
+
end
|
22
|
+
|
23
|
+
def self.require_libraries
|
24
|
+
|
25
|
+
loader_libs = %w{ lib }
|
26
|
+
|
27
|
+
# Base search paths - these will be searched recursively and any xxx.rake files autoimported
|
28
|
+
loader_paths = []
|
29
|
+
|
30
|
+
loader_libs.each {|lib| loader_paths << File.join(root_path, lib) }
|
31
|
+
|
32
|
+
# Define require search paths, any dir in here will be added to LOAD_PATH
|
33
|
+
|
34
|
+
loader_paths.each do |base|
|
35
|
+
$:.unshift base if File.directory?(base)
|
36
|
+
Dir[File.join(base, '**', '**')].each do |p|
|
37
|
+
if File.directory? p
|
38
|
+
$:.unshift p
|
39
|
+
end
|
40
|
+
end
|
41
|
+
end
|
42
|
+
end
|
43
|
+
|
44
|
+
def self.require_tasks
|
45
|
+
# Long parameter lists so ensure rake -T produces nice wide output
|
46
|
+
ENV['RAKE_COLUMNS'] = '180'
|
47
|
+
base = File.join(root_path, 'tasks', '**')
|
48
|
+
Dir["#{base}/*.rake"].sort.each { |ext| load ext }
|
49
|
+
end
|
50
|
+
|
51
|
+
end
|
52
|
+
|
53
|
+
ArLoader::require_libraries
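The load-path setup performed by `require_libraries` can be tried in isolation with a sketch like the one below. It is an approximation under stated assumptions: `directories_under` is a hypothetical helper, and a temporary directory tree stands in for the gem's `lib` folder.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: returns base plus every directory beneath it,
# i.e. the set of paths that would be added to the load path.
def directories_under(base)
  return [] unless File.directory?(base)

  [base] + Dir[File.join(base, '**', '*')].select { |path| File.directory?(path) }
end

Dir.mktmpdir do |root|
  # Build a throwaway tree that resembles the gem layout.
  FileUtils.mkdir_p(File.join(root, 'lib', 'engine', 'jruby'))
  FileUtils.mkdir_p(File.join(root, 'lib', 'loaders'))

  paths = directories_under(File.join(root, 'lib'))

  # In real use each path would be added with $LOAD_PATH.unshift(path);
  # here the relative paths are only printed.
  puts paths.map { |path| path.sub(root, '') }.sort
end
```

Putting every subdirectory on the load path lets files be required by bare name, at the cost of possible name clashes with other libraries.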
|
data/lib/engine/file_definitions.rb
ADDED
@@ -0,0 +1,353 @@
|
|
1
|
+
# Copyright:: (c) Autotelik Media Ltd 2011
|
2
|
+
# Author :: Tom Statter
|
3
|
+
# Date :: Jan 2011
|
4
|
+
# License:: MIT
|
5
|
+
#
|
6
|
+
# Details:: This module acts as helpers for defining input/output file formats as classes.
|
7
|
+
#
|
8
|
+
# It provides a simple interface to define a file structure - field by field.
|
9
|
+
#
|
10
|
+
# By defining the structure, following methods and attributes are mixed in :
|
11
|
+
#
|
12
|
+
# An attribute, with accessor for each field/column.
|
13
|
+
# Parse a line, assigning values to each attribute.
|
14
|
+
# Parse an instance of that file line by line, accepts a block in which data can be processed.
|
15
|
+
# Method to split a file by field.
|
16
|
+
# Method to perform replace operations on a file by field and value.
|
17
|
+
#
|
18
|
+
# Either delimited or a fixed width definition can be created via macro-like class methods :
|
19
|
+
#
|
20
|
+
# create_field_definition [field_list]
|
21
|
+
#
|
22
|
+
# create_fixed_definition {field => range }
|
23
|
+
#
|
24
|
+
# Member attributes, with getters and setters, can be added for each field defined above via class method :
|
25
|
+
#
|
26
|
+
# create_field_attr_accessors
|
27
|
+
#
|
28
|
+
# USAGE :
|
29
|
+
#
|
30
|
+
# Create a class that contains definition of a file.
|
31
|
+
#
|
32
|
+
# class ExampleFixedWidth < FileDefinitionBase
|
33
|
+
# create_fixed_definition(:name => (0..7), :value => (8..15), :ccy => (16..18), :dr_or_cr => (19..19) )
|
34
|
+
#
|
35
|
+
# create_field_attr_accessors
|
36
|
+
# end
|
37
|
+
#
|
38
|
+
# class ExampleCSV < FileDefinitionBase
|
39
|
+
# create_field_definition %w{abc def ghi jkl}
|
40
|
+
#
|
41
|
+
# create_field_attr_accessors
|
42
|
+
# end
|
43
|
+
#
|
44
|
+
# Any instance can then be used to parse the defined file type, with each field or column value
|
45
|
+
# being assigned automatically to the associated instance variable.
|
46
|
+
#
|
47
|
+
# line = '1,2,3,4'
|
48
|
+
# x = ExampleCSV.new( line )
|
49
|
+
#
|
50
|
+
# assert x.respond_to? :jkl
|
51
|
+
# assert_equal x.abc, '1'
|
52
|
+
# assert_equal x.jkl.to_i, 4
|
53
|
+
#
|
54
|
+
module FileDefinitions
|
55
|
+
|
56
|
+
include Enumerable
|
57
|
+
|
58
|
+
attr_accessor :key
|
59
|
+
attr_accessor :current_line
|
60
|
+
|
61
|
+
# Set the delimiter to use when splitting a line - can be either a String, or a Regexp
|
62
|
+
attr_writer :field_delim
|
63
|
+
|
64
|
+
def initialize( line = nil )
|
65
|
+
@key = String.new
|
66
|
+
parse(line) unless line.nil?
|
67
|
+
end
|
68
|
+
|
69
|
+
def self.included(base)
|
70
|
+
base.extend(ClassMethods)
|
71
|
+
subclasses << base
|
72
|
+
end
|
73
|
+
|
74
|
+
def self.subclasses
|
75
|
+
@subclasses ||=[]
|
76
|
+
end
|
77
|
+
|
78
|
+
|
79
|
+
# Return the field delimiter used when splitting a line
|
80
|
+
def field_delim
|
81
|
+
@field_delim || ','
|
82
|
+
end
|
83
|
+
|
84
|
+
# Parse each line of a file based on the field definition, yields self for each successive line
|
85
|
+
#
|
86
|
+
def each( file )
|
87
|
+
File::new(file).each_line do |line|
|
88
|
+
parse( line )
|
89
|
+
yield self
|
90
|
+
end
|
91
|
+
end
|
92
|
+
|
93
|
+
def fields
|
94
|
+
@fields = self.class.field_definition.collect {|f| instance_variable_get "@#{f}" }
|
95
|
+
@fields
|
96
|
+
end
|
97
|
+
|
98
|
+
def to_s
|
99
|
+
fields.join(',')
|
100
|
+
end
|
101
|
+
|
102
|
+
module ClassMethods
|
103
|
+
|
104
|
+
# Helper to generate methods to store and return the complete list of fields
|
105
|
+
# in this File definition (also creates member @field_definition) and parse a line.
|
106
|
+
#
|
107
|
+
# e.g create_field_definition %w{ trade_id drOrCr ccy costCentre postingDate amount }
|
108
|
+
#
|
109
|
+
def create_field_definition( *fields )
|
110
|
+
instance_eval <<-end_eval
|
111
|
+
@field_definition ||= %w{ #{fields.join(' ')} }
|
112
|
+
def field_definition
|
113
|
+
@field_definition
|
114
|
+
end
|
115
|
+
end_eval
|
116
|
+
|
117
|
+
class_eval <<-end_eval
|
118
|
+
def parse( line )
|
119
|
+
@current_line = line
|
120
|
+
before_parse if respond_to? :before_parse
|
121
|
+
@current_line.split(field_delim()).each_with_index {|x, i| instance_variable_set(\"@\#{self.class.field_definition[i]}\", x) }
|
122
|
+
after_parse if respond_to? :after_parse
|
123
|
+
generate_key if respond_to? :generate_key
|
124
|
+
end
|
125
|
+
end_eval
|
126
|
+
end
|
127
|
+
|
128
|
+
def add_field(field, add_accessor = true)
|
129
|
+
@field_definition ||= []
|
130
|
+
@field_definition << field.to_s
|
131
|
+
attr_accessor field if(add_accessor)
|
132
|
+
end
|
133
|
+
|
134
|
+
|
135
|
+
# Helper to generate methods that return the complete list of fixed width fields
|
136
|
+
# and associated ranges in this File definition, and parse a line.
|
137
|
+
# e.g create_fixed_definition(:name => (0..7), :value => (8..15), :ccy => (16..18), :dr_or_cr => (19..19) )
|
138
|
+
#
|
139
|
+
def create_fixed_definition( field_range_map )
|
140
|
+
raise ArgumentError.new('Please supply hash to create_fixed_definition') unless field_range_map.is_a? Hash
|
141
|
+
|
142
|
+
keys = field_range_map.keys.collect(&:to_s)
|
143
|
+
string_map = Hash[*keys.zip(field_range_map.values).flatten]
|
144
|
+
|
145
|
+
instance_eval <<-end_eval
|
146
|
+
def fixed_definition
|
147
|
+
@fixed_definition ||= #{string_map.inspect}
|
148
|
+
@fixed_definition
|
149
|
+
end
|
150
|
+
end_eval
|
151
|
+
|
152
|
+
instance_eval <<-end_eval
|
153
|
+
def field_definition
|
154
|
+
@field_definition ||= %w{ #{keys.join(' ')} }
|
155
|
+
@field_definition
|
156
|
+
end
|
157
|
+
end_eval
|
158
|
+
|
159
|
+
class_eval <<-end_eval
|
160
|
+
def parse( line )
|
161
|
+
@current_line = line
|
162
|
+
before_parse if respond_to? :before_parse
|
163
|
+
self.class.fixed_definition.each do |key, range|
|
164
|
+
instance_variable_set(\"@\#{key}\", @current_line[range])
|
165
|
+
end
|
166
|
+
after_parse if respond_to? :after_parse
|
167
|
+
generate_key if respond_to? :generate_key
|
168
|
+
end
|
169
|
+
end_eval
|
170
|
+
|
171
|
+
end
|
172
|
+
|
173
|
+
# Create accessors for each field
|
174
|
+
def create_field_attr_accessors
|
175
|
+
self.field_definition.each {|f| attr_accessor f}
|
176
|
+
end
|
177
|
+
|
178
|
+
|
179
|
+
###############################
|
180
|
+
# PARSING + FILE MANIPULATION #
|
181
|
+
###############################
|
182
|
+
|
183
|
+
# Parse a complete file and return array of self, one per line
|
184
|
+
def parse_file( file, options = {} )
|
185
|
+
limit = options[:limit]
|
186
|
+
count = 0
|
187
|
+
lines = []
|
188
|
+
File::new(file).each_line do |line|
|
189
|
+
break if limit && ((count += 1) > limit)
|
190
|
+
lines << self.new( line )
|
191
|
+
end
|
192
|
+
lines
|
193
|
+
end
|
194
|
+
|
195
|
+
|
196
|
+
|
197
|
+
# Split a file, whose field definition is represented by self,
|
198
|
+
# into separate streams, based on the values of one of its fields.
|
199
|
+
#
|
200
|
+
# Writes the results, one file per split stream, to directory specified by output_path
|
201
|
+
#
|
202
|
+
# Options:
|
203
|
+
#
|
204
|
+
# :keys => Also write split files of the key fields
|
205
|
+
#
|
206
|
+
# :filter => Optional Regular Expression to act as a filter applied to the field.
|
207
|
+
# For example split by Ccy but filter to only include certain ccys pass
|
208
|
+
# filter => 'GBP|USD'
|
209
|
+
#
|
210
|
+
def split_on_write( file_name, field, output_path, options = {} )
|
211
|
+
|
212
|
+
path = output_path || '.'
|
213
|
+
|
214
|
+
filtered = split_on( file_name, field, options )
|
215
|
+
|
216
|
+
unless filtered.empty?
|
217
|
+
log :info, "Writing separate streams to #{path}"
|
218
|
+
|
219
|
+
filtered.each { |strm, objects| RecsBase::write( {"keys_#{field}_#{strm}.csv" => objects.collect(&:key).join("\n")}, path) } if(options.key?(:keys))
|
220
|
+
|
221
|
+
filtered.each { |strm, objects| RecsBase::write( {"#{field}_#{strm}.csv" => objects.collect(&:current_line).join("\n")}, path) }
|
222
|
+
end
|
223
|
+
end
|
224
|
+
|
225
|
+
# Split a file, whose field definition is represented by self,
|
226
|
+
# into separate streams, based on one of its fields.
|
227
|
+
#
|
228
|
+
# Returns a map of Field value => File def object
|
229
|
+
#
|
230
|
+
# We return the File Def object as this is now enriched, e.g with key fields, compared to the raw file.
|
231
|
+
#
|
232
|
+
# Users can get at the raw line simply by calling the current_line method on the File Def object
|
233
|
+
#
|
234
|
+
# Options:
|
235
|
+
#
|
236
|
+
# :output_path => directory to write the individual streams files to
|
237
|
+
#
|
238
|
+
# :filter => Optional Regular Expression to act as a filter applied to the field.
|
239
|
+
# For example split by Ccy but filter to only include certain ccys pass
|
240
|
+
# filter => 'GBP|USD|EUR'
|
241
|
+
#
|
242
|
+
def split_on( file_name, field, options = {} )
|
243
|
+
|
244
|
+
regex = options[:filter] ? Regexp.new(options[:filter]) : nil
|
245
|
+
|
246
|
+
log :debug, "Using REGEX: #{regex.inspect}" if regex
|
247
|
+
|
248
|
+
filtered = {}
|
249
|
+
|
250
|
+
if( self.new.respond_to?(field) )
|
251
|
+
|
252
|
+
log :info, "Splitting on #{field}"
|
253
|
+
|
254
|
+
File.open( file_name ) do |t|
|
255
|
+
t.each do |line|
|
256
|
+
next unless(line && line.chomp!)
|
257
|
+
x = self.new(line)
|
258
|
+
|
259
|
+
value = x.send( field.to_sym ) # the actual field value from the specified field column
|
260
|
+
next if value.nil?
|
261
|
+
|
262
|
+
if( regex.nil? || value.match(regex) )
|
263
|
+
filtered[value] ? filtered[value] << x : filtered[value] = [x]
|
264
|
+
end
|
265
|
+
end
|
266
|
+
end
|
267
|
+
else
|
268
|
+
log :warn, "Field [#{field}] not defined for file definition #{self.name}"
|
269
|
+
end
|
270
|
+
|
271
|
+
if( options[:sort])
|
272
|
+
filtered.values.each( &:sort! )
|
273
|
+
return filtered
|
274
|
+
end
|
275
|
+
return filtered
|
276
|
+
end
|
277
|
+
|
278
|
+
# Open and parse a file, replacing a value in the specified field.
|
279
|
+
# Does not update the file itself. Does not write a new output file.
|
280
|
+
#
|
281
|
+
# Returns :
|
282
|
+
# 1) full collection of updated lines
|
283
|
+
# 2) collection of file def objects (self), with updated value.
|
284
|
+
#
|
285
|
+
# Finds values matching old_value in given map
|
286
|
+
#
|
287
|
+
# Replaces matches with new_value in map.
|
288
|
+
#
|
289
|
+
# Accepts more than one field, if fields is either an array of strings
|
290
|
+
# or comma separated list of fields.
|
291
|
+
#
|
292
|
+
def file_set_field_by_map( file_name, fields, value_map, regex = nil )
|
293
|
+
|
294
|
+
lines, objects = [],[]
|
295
|
+
|
296
|
+
if fields.is_a?(Array)
|
297
|
+
attribs = fields
|
298
|
+
else
|
299
|
+
attribs = "#{fields}".split(',')
|
300
|
+
end
|
301
|
+
|
302
|
+
attribs.each do |attrib|
|
303
|
+
raise BadConfigError.new("Field: #{attrib} is not a field on #{self.name}") unless self.new.respond_to?(attrib)
|
304
|
+
end
|
305
|
+
|
306
|
+
log :info, "#{self.name} - updating field(s) #{fields} in #{file_name}"
|
307
|
+
|
308
|
+
File.open( file_name ) do |t|
|
309
|
+
t.each do |line|
|
310
|
+
if line.chomp.empty?
|
311
|
+
lines << line
|
312
|
+
objects << self.new
|
313
|
+
next
|
314
|
+
end
|
315
|
+
x = self.new(line)
|
316
|
+
|
317
|
+
attribs.each do |a|
|
318
|
+
old_value = x.instance_variable_get( "@#{a}" )
|
319
|
+
x.instance_variable_set( "@#{a}", value_map[old_value] ) if value_map[old_value] || (regex && value_map.keys.detect {|k| k.match(regex) })
|
320
|
+
end
|
321
|
+
|
322
|
+
objects << x
|
323
|
+
lines << x.to_s
|
324
|
+
end
|
325
|
+
end
|
326
|
+
|
327
|
+
return lines, objects
|
328
|
+
end
|
329
|
+
end # END class methods
|
330
|
+
|
331
|
+
# Open and parse a file, replacing a value in the specified field.
|
332
|
+
# Does not update the file itself. Does not write a new output file.
|
333
|
+
#
|
334
|
+
# Returns :
|
335
|
+
# 1) full collection of updated lines
|
336
|
+
# 2) collection of file def objects (self), with updated value.
|
337
|
+
#
|
338
|
+
# Finds values matching old_value, and also accepts an optional regex for more powerful
|
339
|
+
# matching strategies of values on the specified field.
|
340
|
+
#
|
341
|
+
# Replaces matches with new_value.
|
342
|
+
#
|
343
|
+
# Accepts more than one field, if fields is either an array of strings
|
344
|
+
# or comma separated list of fields.
|
345
|
+
#
|
346
|
+
def file_set_field( file_name, field, old_value, new_value, regex = nil )
|
347
|
+
|
348
|
+
map = {old_value => new_value}
|
349
|
+
|
350
|
+
return self.class.file_set_field_by_map(file_name, field, map, regex)
|
351
|
+
end
|
352
|
+
|
353
|
+
end
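To make the fixed-width parsing and field-based splitting in this module easier to follow, here is a small standalone sketch. It does not use the module itself: `FixedWidthLine` is a hypothetical stand-in for a class built with `create_fixed_definition`, with field names and ranges borrowed from the usage comment at the top of this file, and the sample data is invented.

```ruby
# Hypothetical stand-in for a class built with create_fixed_definition;
# the field names and ranges are copied from the usage comment in this file.
class FixedWidthLine
  FIELDS = {
    'name' => (0..7), 'value' => (8..15), 'ccy' => (16..18), 'dr_or_cr' => (19..19)
  }.freeze

  attr_reader(*FIELDS.keys)
  attr_reader :current_line

  # Slice each field out of the line by its character range.
  def initialize(line)
    @current_line = line
    FIELDS.each { |field, range| instance_variable_set("@#{field}", line[range]) }
  end
end

# Invented sample rows; format pads them so the column widths are exact.
rows = [['ACME', 1250, 'GBP', 'D'], ['BETA', 90, 'USD', 'C'], ['GAMMA', 7, 'JPY', 'D']]
lines = rows.map { |row| format('%-8s%08d%s%s', *row) }

# Keep only GBP/USD lines, then group the parsed objects by currency.
filter = Regexp.new('GBP|USD')
records = lines.map { |line| FixedWidthLine.new(line) }
by_ccy = records.select { |rec| rec.ccy =~ filter }.group_by(&:ccy)

by_ccy.each { |ccy, recs| puts "#{ccy}: #{recs.map { |r| r.name.strip }.join(', ')}" }
```

The filter is written as an alternation (`GBP|USD`) because square brackets would turn it into a character class that matches any single one of those letters.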
|