data_miner 2.0.1 → 2.0.2

data/.gitignore CHANGED
@@ -1,10 +1,8 @@
1
- *.sw?
2
1
  .DS_Store
3
- coverage
4
- rdoc
5
- pkg
6
- test/test.sqlite3
7
- data_miner.log
2
+ /coverage
3
+ /rdoc
4
+ /pkg
8
5
  Gemfile.lock
9
6
  *.gem
10
- test.log
7
+ /.yardoc
8
+ /doc
data/CHANGELOG CHANGED
@@ -1,3 +1,16 @@
1
+ 2.0.2 / 2012-05-04
2
+
3
+ * Breaking changes
4
+
5
+ * Import descriptions are no longer optional
6
+ * Import options are no longer optional (but then, they never were)
7
+
8
+ * Enhancements
9
+
10
+ * Real documentation!
11
+ * Replace class-level mutexes with simple Thread.exclusive calls
12
+ * Simplified DataMiner::Dictionary
13
+
1
14
  2.0.1 / 2012-04-18
2
15
 
3
16
  * Enhancements
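
Note on the "Replace class-level mutexes" entry: the pattern involved is lazy memoization with a second check inside the lock. The sketch below is an illustration only. `Registry` and `LOCK` are hypothetical names, and a plain `Mutex` stands in for `Thread.exclusive`, which is not available on recent Ruby versions.

```ruby
require 'set'

# Hypothetical registry, not part of data_miner. It only illustrates the
# "check, lock, check again" memoization used for lazily built state.
class Registry
  LOCK = Mutex.new # stand-in for Thread.exclusive (an assumption of this sketch)

  def model_names
    # Fast path: skip the lock once the value exists.
    @model_names || LOCK.synchronize do
      # Re-check inside the lock so that only one thread builds the Set.
      @model_names ||= Set.new
    end
  end
end

registry = Registry.new
sets = 8.times.map { Thread.new { registry.model_names } }.map(&:value)
puts sets.map(&:object_id).uniq.size
```

Every thread should receive the same `Set` object, because the second `||=` runs only while the lock is held.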
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2011 Brighter Planet
1
+ Copyright (c) 2012 Brighter Planet
2
2
 
3
3
  Permission is hereby granted, free of charge, to any person obtaining
4
4
  a copy of this software and associated documentation files (the
@@ -0,0 +1,112 @@
1
+ # data_miner
2
+
3
+ Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models.
4
+
5
+ Tested in MRI 1.8.7+, MRI 1.9.2+, and JRuby 1.6.7+. Thread safe.
6
+
7
+ ## Real-world usage
8
+
9
+ <p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
10
+
11
+ We use `data_miner` for [data science at Brighter Planet](http://brighterplanet.com/research) and in production at
12
+
13
+ * [Brighter Planet's reference data web service](http://data.brighterplanet.com)
14
+ * [Brighter Planet's impact estimate web service](http://impact.brighterplanet.com)
15
+
16
+ The killer combination for us is:
17
+
18
+ 1. [`active_record_inline_schema`](https://github.com/seamusabshere/active_record_inline_schema) - define table structure
19
+ 2. [`remote_table`](https://github.com/seamusabshere/remote_table) - download data and parse it
20
+ 3. [`errata`](https://github.com/seamusabshere/errata) - apply corrections in a transparent way
21
+ 4. [`data_miner`](https://github.com/seamusabshere/data_miner) (this library!) - import data idempotently
22
+
23
+ ## Documentation
24
+
25
+ Check out the [extensive documentation](http://rdoc.info/github/seamusabshere/data_miner).
26
+
27
+ ## Quick start
28
+
29
+ You define <code>data_miner</code> blocks in your ActiveRecord models. For example, in <code>app/models/country.rb</code>:
30
+
31
+ class Country < ActiveRecord::Base
32
+ self.primary_key = 'iso_3166_code'
33
+
34
+ data_miner do
35
+ import("OpenGeoCode.org's Country Codes to Country Names list",
36
+ :url => 'http://opengeocode.org/download/countrynames.txt',
37
+ :format => :delimited,
38
+ :delimiter => '; ',
39
+ :headers => false,
40
+ :skip => 22) do
41
+ key :iso_3166_code, :field_number => 0
42
+ store :iso_3166_alpha_3_code, :field_number => 1
43
+ store :iso_3166_numeric_code, :field_number => 2
44
+ store :name, :field_number => 5
45
+ end
46
+ end
47
+ end
48
+
49
+ Now you can run:
50
+
51
+ >> Country.run_data_miner!
52
+ => nil
53
+
54
+ ## More advanced usage
55
+
56
+ The [`earth` library](https://github.com/brighterplanet/earth) has dozens of real-life examples showing how to download, pull out of a ZIP/TAR/BZ2 archive, parse, correct, and import CSVs, fixed-width files, ODS, XLS, XLSX, even HTML and XML:
57
+
58
+ <table>
59
+ <tr>
60
+ <th>Model</th>
61
+ <th>Highlights</th>
62
+ <th>Reference</th>
63
+ </tr>
64
+ <tr>
65
+ <td><a href="http://data.brighterplanet.com/aircraft">Aircraft</a></td>
66
+ <td>parsing Microsoft Frontpage HTML (!)</td>
67
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb">data_miner.rb</a></td>
68
+ </tr>
69
+ <tr>
70
+ <td><a href="http://data.brighterplanet.com/airports">Airports</a></td>
71
+ <td>forcing column names and use of <code>:select</code> block (<code>Proc</code>)</td>
72
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/airport/data_miner.rb">data_miner.rb</a></td>
73
+ </tr>
74
+ <tr>
75
+ <td><a href="http://data.brighterplanet.com/automobile_make_model_year_variants">Automobile model variants</a></td>
76
+ <td>super advanced usage of "custom parser" and errata</td>
77
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb">data_miner.rb</a></td>
78
+ </tr>
79
+ <tr>
80
+ <td><a href="http://data.brighterplanet.com/countries">Country</a></td>
81
+ <td>parsing CSV and a few other tricks</td>
82
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/country/data_miner.rb">data_miner.rb</a></td>
83
+ </tr>
84
+ <tr>
85
+ <td><a href="http://data.brighterplanet.com/egrid_regions">EGRID regions</a></td>
86
+ <td>parsing XLS</td>
87
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/egrid_region/data_miner.rb">data_miner.rb</a></td>
88
+ </tr>
89
+ <tr>
90
+ <td><a href="http://data.brighterplanet.com/flight_segments">Flight segment (stage)</a></td>
91
+ <td>super advanced usage of POSTing form data</td>
92
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/flight_segment/data_miner.rb">data_miner.rb</a></td>
93
+ </tr>
94
+ <tr>
95
+ <td><a href="http://data.brighterplanet.com/zip_codes">Zip codes</a></td>
96
+ <td>downloading a ZIP file and pulling an XLSX out of it</td>
97
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/zip_code.rb">data_miner.rb</a></td>
98
+ </tr>
99
+ </table>
100
+
101
+ And many more - look for the `data_miner.rb` file that corresponds to each model. Note that you would normally put the `data_miner` declaration right inside the ActiveRecord model file... it's kept separate in `earth` so that loading it is optional.
102
+
103
+ ## Authors
104
+
105
+ * Seamus Abshere <seamus@abshere.net>
106
+ * Andy Rossmeissl <andy@rossmeissl.net>
107
+ * Derek Kastner <dkastner@gmail.com>
108
+ * Ian Hough <ijhough@gmail.com>
109
+
110
+ ## Copyright
111
+
112
+ Copyright (c) 2012 Brighter Planet. See LICENSE for details.
@@ -7,8 +7,8 @@ Gem::Specification.new do |s|
7
7
  s.authors = ["Seamus Abshere", "Andy Rossmeissl", "Derek Kastner"]
8
8
  s.email = ["seamus@abshere.net"]
9
9
  s.homepage = "https://github.com/seamusabshere/data_miner"
10
- s.summary = %{Mine remote data into your ActiveRecord models.}
11
- s.description = %q{Mine remote data into your ActiveRecord models. You can also convert units.}
10
+ s.summary = %{Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models.}
11
+ s.description = %q{Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. You can also convert units.}
12
12
 
13
13
  s.rubyforge_project = "data_miner"
14
14
 
@@ -14,7 +14,7 @@ if RUBY_VERSION >= '1.9'
14
14
  end
15
15
  end
16
16
 
17
- require 'data_miner/active_record_extensions'
17
+ require 'data_miner/active_record_class_methods'
18
18
  require 'data_miner/attribute'
19
19
  require 'data_miner/script'
20
20
  require 'data_miner/dictionary'
@@ -24,14 +24,13 @@ require 'data_miner/step/tap'
24
24
  require 'data_miner/step/process'
25
25
  require 'data_miner/run'
26
26
 
27
+ # A singleton class that holds global configuration for data mining.
28
+ #
29
+ # All of its instance methods are delegated to +DataMiner.instance+, so you can call +DataMiner.model_names+, for example.
30
+ #
31
+ # @see DataMiner::ActiveRecordClassMethods#data_miner Overview of how to define data miner scripts inside of ActiveRecord models.
27
32
  class DataMiner
28
33
  class << self
29
- delegate :perform, :to => :instance
30
- delegate :run, :to => :instance
31
- delegate :logger, :to => :instance
32
- delegate :logger=, :to => :instance
33
- delegate :model_names, :to => :instance
34
-
35
34
  # @private
36
35
  def downcase(str)
37
36
  defined?(::UnicodeUtils) ? ::UnicodeUtils.downcase(str) : str.downcase
@@ -48,16 +47,20 @@ class DataMiner
48
47
  end
49
48
  end
50
49
 
51
- MUTEX = ::Mutex.new
52
50
  INNER_SPACE = /[ ]+/
53
51
 
54
52
  include ::Singleton
55
53
 
56
54
  attr_writer :logger
57
55
 
56
+ # Run data miner scripts on models identified by their names. Defaults to all models.
57
+ #
58
+ # @param [optional, Array<String>] model_names Names of models to be run.
59
+ #
60
+ # @return [Array<DataMiner::Run>]
58
61
  def perform(model_names = DataMiner.model_names)
59
62
  Script.uniq do
60
- model_names.each do |model_name|
63
+ model_names.map do |model_name|
61
64
  model_name.constantize.run_data_miner!
62
65
  end
63
66
  end
@@ -66,8 +69,11 @@ class DataMiner
66
69
  # legacy
67
70
  alias :run :perform
68
71
 
72
+ # Where DataMiner logs to. Defaults to +Rails.logger+ or +ActiveRecord::Base.logger+ if either is available.
73
+ #
74
+ # @return [Logger]
69
75
  def logger
70
- @logger || MUTEX.synchronize do
76
+ @logger || ::Thread.exclusive do
71
77
  @logger ||= if defined?(::Rails)
72
78
  ::Rails.logger
73
79
  elsif defined?(::ActiveRecord) and active_record_logger = ::ActiveRecord::Base.logger
@@ -79,12 +85,20 @@ class DataMiner
79
85
  end
80
86
  end
81
87
 
88
+ # Names of the models that have defined a data miner script.
89
+ #
90
+ # @note Models won't appear here until the files containing their data miner scripts have been +require+'d.
91
+ #
92
+ # @return [Set<String>]
82
93
  def model_names
83
- @model_names || MUTEX.synchronize do
94
+ @model_names || ::Thread.exclusive do
84
95
  @model_names ||= ::Set.new
85
96
  end
86
97
  end
87
98
 
99
+ class << self
100
+ delegate(*DataMiner.instance_methods(false), :to => :instance)
101
+ end
88
102
  end
89
103
 
90
- ::ActiveRecord::Base.extend ::DataMiner::ActiveRecordExtensions
104
+ ::ActiveRecord::Base.extend ::DataMiner::ActiveRecordClassMethods
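
Note on the delegation change above: the per-method `delegate` calls are replaced by a single call built from `instance_methods(false)`, which is presumably why the block moved to the bottom of the class, since only methods already defined at that point are listed. A minimal sketch of the same idea in plain Ruby follows. `MinerConfig` and `register` are hypothetical, and `define_singleton_method` stands in for ActiveSupport's `delegate`.

```ruby
require 'singleton'
require 'set'

# Hypothetical class for illustration; data_miner itself uses ActiveSupport's delegate.
class MinerConfig
  include Singleton

  def model_names
    @model_names ||= Set.new
  end

  def register(name)
    model_names.add(name)
  end

  # Must come after the method definitions: instance_methods(false) only
  # lists the methods defined directly on this class so far.
  instance_methods(false).each do |method_name|
    define_singleton_method(method_name) do |*args, &blk|
      instance.public_send(method_name, *args, &blk)
    end
  end
end

MinerConfig.register('Country')
puts MinerConfig.model_names.include?('Country')
```

Placing the loop above the method definitions would forward nothing, because the list would be empty at that point.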
@@ -0,0 +1,108 @@
1
+ require 'active_record'
2
+ require 'lock_method'
3
+
4
+ class DataMiner
5
+ # Class methods that are mixed into models (i.e. ActiveRecord::Base)
6
+ module ActiveRecordClassMethods
7
+ # Access this model's script.
8
+ #
9
+ # @return [DataMiner::Script] This model's data miner script.
10
+ def data_miner_script
11
+ @data_miner_script || ::Thread.exclusive do
12
+ @data_miner_script ||= DataMiner::Script.new(self)
13
+ end
14
+ end
15
+
16
+ # Access to recordkeeping.
17
+ #
18
+ # @return [ActiveRecord::Relation] Records of running the data miner script.
19
+ def data_miner_runs
20
+ DataMiner::Run.scoped :conditions => { :model_name => name }
21
+ end
22
+
23
+ # Run this model's script.
24
+ #
25
+ # @return [DataMiner::Run]
26
+ def run_data_miner!
27
+ data_miner_script.perform
28
+ end
29
+
30
+ # Run the data miner scripts of parent associations. Useful for dependencies. Safe to call using +process+.
31
+ #
32
+ # @note Used extensively in https://github.com/brighterplanet/earth
33
+ #
34
+ # @example Since Provinces depend on Countries, make sure Countries are data mined first
35
+ # class Country < ActiveRecord::Base
36
+ # [...some data miner script...]
37
+ # end
38
+ # class Province < ActiveRecord::Base
39
+ # belongs_to :country
40
+ # data_miner do
41
+ # [...]
42
+ # process "make sure my dependencies have been loaded" do
43
+ # run_data_miner_on_parent_associations!
44
+ # end
45
+ # [...]
46
+ # end
47
+ # end
48
+ #
49
+ # @return [Array<DataMiner::Run>]
50
+ def run_data_miner_on_parent_associations!
51
+ reflect_on_all_associations(:belongs_to).reject do |assoc|
52
+ assoc.options[:polymorphic]
53
+ end.map do |non_polymorphic_belongs_to_assoc|
54
+ non_polymorphic_belongs_to_assoc.klass.run_data_miner!
55
+ end
56
+ end
57
+
58
+ # Define a data miner script.
59
+ #
60
+ # @param [optional, Hash] options
61
+ # @option options [TrueClass, FalseClass] :append (false) Add steps to existing data miner script instead of starting from scratch.
62
+ #
63
+ # @yield [] The block defining the steps.
64
+ #
65
+ # @see DataMiner::Script#import
66
+ # @see DataMiner::Script#process
67
+ # @see DataMiner::Script#tap
68
+ #
69
+ # @example Creating steps
70
+ # class MyModel < ActiveRecord::Base
71
+ # data_miner do
72
+ # process [...]
73
+ # import [...]
74
+ # import [...yes, it's ok to have more than one import step...]
75
+ # process [...]
76
+ # [...etc...]
77
+ # end
78
+ # end
79
+ #
80
+ # @example From the README
81
+ # class Country < ActiveRecord::Base
82
+ # self.primary_key = 'iso_3166_code'
83
+ # data_miner do
84
+ # import("OpenGeoCode.org's Country Codes to Country Names list",
85
+ # :url => 'http://opengeocode.org/download/countrynames.txt',
86
+ # :format => :delimited,
87
+ # :delimiter => '; ',
88
+ # :headers => false,
89
+ # :skip => 22) do
90
+ # key :iso_3166_code, :field_number => 0
91
+ # store :iso_3166_alpha_3_code, :field_number => 1
92
+ # store :iso_3166_numeric_code, :field_number => 2
93
+ # store :name, :field_number => 5
94
+ # end
95
+ # end
96
+ # end
97
+ #
98
+ # @return [nil]
99
+ def data_miner(options = {}, &blk)
100
+ DataMiner.model_names.add name
101
+ unless options[:append]
102
+ @data_miner_script = nil
103
+ end
104
+ data_miner_script.append_block blk
105
+ nil
106
+ end
107
+ end
108
+ end
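
Note on `:append`: as written above, `data_miner` discards the existing script unless `:append` is set, then appends the new block. The sketch below mimics that control flow without ActiveRecord. `ScriptSketch`, `MinableSketch`, and `script` are hypothetical stand-ins, not data_miner classes or methods, and the locking is left out for brevity.

```ruby
# Hypothetical stand-in for a script that collects step-defining blocks.
class ScriptSketch
  attr_reader :blocks

  def initialize
    @blocks = []
  end

  def append_block(blk)
    @blocks << blk
  end
end

# Hypothetical stand-in for the class methods mixed into a model.
module MinableSketch
  def script
    @script ||= ScriptSketch.new
  end

  def data_miner(options = {}, &blk)
    @script = nil unless options[:append] # start from scratch by default
    script.append_block(blk)
    nil
  end
end

class Country
  extend MinableSketch
end

Country.data_miner { :first }
Country.data_miner { :second }                 # replaces the first block
Country.data_miner(:append => true) { :third } # keeps :second, adds :third
puts Country.script.blocks.map(&:call).inspect
```

Under these assumptions only the second and third blocks survive, which matches the "starting from scratch" wording in the option's documentation.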
@@ -1,8 +1,14 @@
1
1
  require 'conversions'
2
2
 
3
3
  class DataMiner
4
+ # A mapping between a local model column and a remote data source column.
5
+ #
6
+ # @see DataMiner::ActiveRecordClassMethods#data_miner Overview of how to define data miner scripts inside of ActiveRecord models.
7
+ # @see DataMiner::Step::Import#store
8
+ # @see DataMiner::Step::Import#key
4
9
  class Attribute
5
10
  class << self
11
+ # @private
6
12
  def check_options(options)
7
13
  errors = []
8
14
  if options[:dictionary].is_a?(Dictionary)
@@ -18,26 +24,26 @@ class DataMiner
18
24
  end
19
25
  end
20
26
 
21
- VALID_OPTIONS = %w{
22
- from_units
23
- to_units
24
- static
25
- dictionary
26
- matcher
27
- field_name
28
- delimiter
29
- split
30
- units
31
- sprintf
32
- nullify
33
- overwrite
34
- upcase
35
- units_field_name
36
- units_field_number
37
- field_number
38
- chars
39
- synthesize
40
- }.map(&:to_sym)
27
+ VALID_OPTIONS = [
28
+ :from_units,
29
+ :to_units,
30
+ :static,
31
+ :dictionary,
32
+ :matcher,
33
+ :field_name,
34
+ :delimiter,
35
+ :split,
36
+ :units,
37
+ :sprintf,
38
+ :nullify,
39
+ :overwrite,
40
+ :upcase,
41
+ :units_field_name,
42
+ :units_field_number,
43
+ :field_number,
44
+ :chars,
45
+ :synthesize,
46
+ ]
41
47
 
42
48
  VALID_UNIT_DEFINITION_SETS = [
43
49
  [:units],
@@ -48,30 +54,102 @@ class DataMiner
48
54
  [:units_field_number, :to_units],
49
55
  ]
50
56
 
51
- DEFAULT_SPLIT = /\s+/
52
- DEFAULT_KEEP = 0
57
+ DEFAULT_SPLIT_PATTERN = /\s+/
58
+ DEFAULT_SPLIT_KEEP = 0
53
59
  DEFAULT_DELIMITER = ', '
54
60
  DEFAULT_NULLIFY = false
55
61
  DEFAULT_UPCASE = false
56
62
  DEFAULT_OVERWRITE = true
57
63
 
64
+ # @private
58
65
  attr_reader :step
66
+
67
+ # Local column name.
68
+ # @return [Symbol]
59
69
  attr_reader :name
70
+
71
+ # Synthesize a value by passing a proc that will receive +row+ and should return a final value.
72
+ #
73
+ # +row+ will be a +Hash+ with string keys or (less often) an +Array+
74
+ #
75
+ # @return [Proc]
60
76
  attr_reader :synthesize
77
+
78
+ # An object that will be sent +#match(row)+ and should return a final value.
79
+ #
80
+ # Can be specified as a String, which will be constantized into a class; an object of that class is then instantiated with no arguments.
81
+ #
82
+ # +row+ will be a +Hash+ with string keys or (less often) an +Array+
83
+ # @return [Object]
61
84
  attr_reader :matcher
85
+
86
+ # Index of where to find the data in the row, starting from zero.
87
+ #
88
+ # If you pass a +Range+, then multiple fields will be joined together.
89
+ #
90
+ # @return [Integer, Range]
62
91
  attr_reader :field_number
92
+
93
+ # Where to find the data in the row.
94
+ # @return [Symbol]
63
95
  attr_reader :field_name
64
- # For use when joining a range of field numbers
96
+
97
+ # A delimiter to be used when joining fields together into a single final value. Used when +:field_number+ is a +Range+. Defaults to DEFAULT_DELIMITER.
98
+ # @return [String]
65
99
  attr_reader :delimiter
100
+
101
+ # Which characters in a field to keep. Zero-based.
102
+ # @return [Range]
66
103
  attr_reader :chars
104
+
105
+ # How to split a field. You specify two options:
106
+ #
107
+ # +:pattern+: what to split on. Defaults to DEFAULT_SPLIT_PATTERN.
108
+ # +:keep+: which of the elements resulting from the split to keep. Defaults to DEFAULT_SPLIT_KEEP.
109
+ #
110
+ # @return [Hash]
67
111
  attr_reader :split
112
+
113
+ # Final units. May invoke a conversion using https://github.com/seamusabshere/conversions
114
+ #
115
+ # If a local column named +[name]_units+ exists, it will be populated with this value.
116
+ #
117
+ # @return [Symbol]
68
118
  attr_reader :to_units
119
+
120
+ # Initial units. May invoke a conversion using https://github.com/seamusabshere/conversions
121
+ # @return [Symbol]
69
122
  attr_reader :from_units
123
+
124
+ # If every row specifies its own units, index of where to find the units. Zero-based.
125
+ # @return [Integer]
70
126
  attr_reader :units_field_number
127
+
128
+ # If every row specifies its own units, where to find the units.
129
+ # @return [Symbol]
71
130
  attr_reader :units_field_name
131
+
132
+ # A +sprintf+-style format to apply.
133
+ # @return [String]
72
134
  attr_reader :sprintf
135
+
136
+ # A static value to be used.
137
+ # @return [String,Numeric,TrueClass,FalseClass,Object]
73
138
  attr_reader :static
74
139
 
140
+ # Whether to nullify the value in a local column if it was not previously null. Defaults to DEFAULT_NULLIFY.
141
+ # @return [TrueClass,FalseClass]
142
+ attr_reader :nullify
143
+
144
+ # Whether to upcase value. Defaults to DEFAULT_UPCASE.
145
+ # @return [TrueClass,FalseClass]
146
+ attr_reader :upcase
147
+
148
+ # Whether to overwrite the value in a local column if it is not null. Defaults to DEFAULT_OVERWRITE.
149
+ # @return [TrueClass,FalseClass]
150
+ attr_reader :overwrite
151
+
152
+ # @private
75
153
  def initialize(step, name, options = {})
76
154
  options = options.symbolize_keys
77
155
  if (errors = Attribute.check_options(options)).any?
@@ -81,7 +159,7 @@ class DataMiner
81
159
  @name = name
82
160
  @synthesize = options[:synthesize]
83
161
  if @dictionary_boolean = options.has_key?(:dictionary)
84
- @dictionary_options = options[:dictionary]
162
+ @dictionary_settings = options[:dictionary]
85
163
  end
86
164
  @matcher = options[:matcher].is_a?(::String) ? options[:matcher].constantize.new : options[:matcher]
87
165
  if @static_boolean = options.has_key?(:static)
@@ -94,52 +172,42 @@ class DataMiner
94
172
  if split = options[:split]
95
173
  @split = split.symbolize_keys
96
174
  end
97
- @nullify_boolean = options.fetch :nullify, DEFAULT_NULLIFY
98
- @upcase_boolean = options.fetch :upcase, DEFAULT_UPCASE
175
+ @nullify = options.fetch :nullify, DEFAULT_NULLIFY
176
+ @upcase = options.fetch :upcase, DEFAULT_UPCASE
99
177
  @from_units = options[:from_units]
100
178
  @to_units = options[:to_units] || options[:units]
101
179
  @sprintf = options[:sprintf]
102
- @overwrite_boolean = options.fetch :overwrite, DEFAULT_OVERWRITE
180
+ @overwrite = options.fetch :overwrite, DEFAULT_OVERWRITE
103
181
  @units_field_name = options[:units_field_name]
104
182
  @units_field_number = options[:units_field_number]
105
183
  @dictionary_mutex = ::Mutex.new
106
184
  end
107
185
 
108
- def model
109
- step.model
110
- end
111
-
112
- def static?
113
- @static_boolean
114
- end
115
-
116
- def nullify?
117
- @nullify_boolean
118
- end
119
-
120
- def upcase?
121
- @upcase_boolean
122
- end
123
-
124
- def dictionary?
125
- @dictionary_boolean
126
- end
127
-
128
- def convert?
129
- from_units.present? or units_field_name.present? or units_field_number.present?
186
+ # Dictionary for translating.
187
+ #
188
+ # You pass a +Hash+ of options which is used to initialize a +DataMiner::Dictionary+.
189
+ #
190
+ # @return [DataMiner::Dictionary]
191
+ def dictionary
192
+ @dictionary || @dictionary_mutex.synchronize do
193
+ @dictionary ||= Dictionary.new(@dictionary_settings)
194
+ end
130
195
  end
131
196
 
132
- def units?
133
- to_units.present? or units_field_name.present? or units_field_number.present?
197
+ # @private
198
+ def set_from_row(local_record, remote_row)
199
+ if overwrite or local_record.send(name).nil?
200
+ local_record.send "#{name}=", read(remote_row)
201
+ end
202
+ if units? and ((final_to_units = (to_units || read_units(remote_row))) or nullify)
203
+ local_record.send "#{name}_units=", final_to_units
204
+ end
134
205
  end
135
206
 
136
- def overwrite?
137
- @overwrite_boolean
138
- end
139
-
207
+ # @private
140
208
  def read(row)
141
- if matcher and matched_row = matcher.match(row)
142
- return matched_row
209
+ if matcher and matcher_output = matcher.match(row)
210
+ return matcher_output
143
211
  end
144
212
  if synthesize
145
213
  return synthesize.call(row)
@@ -168,15 +236,15 @@ class DataMiner
168
236
  value = value[chars]
169
237
  end
170
238
  if split
171
- pattern = split.fetch :pattern, DEFAULT_SPLIT
172
- keep = split.fetch :keep, DEFAULT_KEEP
239
+ pattern = split.fetch :pattern, DEFAULT_SPLIT_PATTERN
240
+ keep = split.fetch :keep, DEFAULT_SPLIT_KEEP
173
241
  value = value.to_s.split(pattern)[keep].to_s
174
242
  end
175
243
  value = DataMiner.compress_whitespace value
176
- if nullify? and value.blank?
244
+ if nullify and value.blank?
177
245
  return
178
246
  end
179
- if upcase?
247
+ if upcase
180
248
  value = DataMiner.upcase value
181
249
  end
182
250
  if convert?
@@ -201,27 +269,33 @@ class DataMiner
201
269
  value
202
270
  end
203
271
 
204
- def set_from_row(target, row)
205
- if overwrite? or target.send(name).nil?
206
- target.send "#{name}=", read(row)
207
- end
208
- if units? and ((final_to_units = (to_units || read_units(row))) or nullify?)
209
- target.send "#{name}_units=", final_to_units
210
- end
211
- end
212
-
213
- def dictionary
214
- @dictionary || @dictionary_mutex.synchronize do
215
- @dictionary ||= Dictionary.new(@dictionary_options)
216
- end
217
- end
218
-
272
+ # @private
219
273
  def refresh
220
274
  @dictionary = nil
221
275
  end
222
276
 
223
277
  private
224
278
 
279
+ def model
280
+ step.model
281
+ end
282
+
283
+ def static?
284
+ @static_boolean
285
+ end
286
+
287
+ def dictionary?
288
+ @dictionary_boolean
289
+ end
290
+
291
+ def convert?
292
+ from_units.present? or units_field_name.present? or units_field_number.present?
293
+ end
294
+
295
+ def units?
296
+ to_units.present? or units_field_name.present? or units_field_number.present?
297
+ end
298
+
225
299
  def read_units(row)
226
300
  if units = row[units_field_name || units_field_number]
227
301
  DataMiner.compress_whitespace(units).underscore.to_sym