data_miner 2.0.1 → 2.0.2

data/.gitignore CHANGED
@@ -1,10 +1,8 @@
1
- *.sw?
2
1
  .DS_Store
3
- coverage
4
- rdoc
5
- pkg
6
- test/test.sqlite3
7
- data_miner.log
2
+ /coverage
3
+ /rdoc
4
+ /pkg
8
5
  Gemfile.lock
9
6
  *.gem
10
- test.log
7
+ /.yardoc
8
+ /doc
data/CHANGELOG CHANGED
@@ -1,3 +1,16 @@
1
+ 2.0.2 / 2012-05-04
2
+
3
+ * Breaking changes
4
+
5
+ * Import descriptions are no longer optional
6
+ * Import options are no longer optional (but then, they never were)
7
+
8
+ * Enhancements
9
+
10
+ * Real documentation!
11
+ * Replace class-level mutexes with simple Thread.exclusive calls
12
+ * Simplified DataMiner::Dictionary
13
+
1
14
  2.0.1 / 2012-04-18
2
15
 
3
16
  * Enhancements
data/LICENSE CHANGED
@@ -1,4 +1,4 @@
1
- Copyright (c) 2011 Brighter Planet
1
+ Copyright (c) 2012 Brighter Planet
2
2
 
3
3
  Permission is hereby granted, free of charge, to any person obtaining
4
4
  a copy of this software and associated documentation files (the
@@ -0,0 +1,112 @@
1
+ # data_miner
2
+
3
+ Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models.
4
+
5
+ Tested in MRI 1.8.7+, MRI 1.9.2+, and JRuby 1.6.7+. Thread safe.
6
+
7
+ ## Real-world usage
8
+
9
+ <p><a href="http://brighterplanet.com"><img src="https://s3.amazonaws.com/static.brighterplanet.com/assets/logos/flush-left/inline/green/rasterized/brighter_planet-160-transparent.png" alt="Brighter Planet logo"/></a></p>
10
+
11
+ We use `data_miner` for [data science at Brighter Planet](http://brighterplanet.com/research) and in production at
12
+
13
+ * [Brighter Planet's reference data web service](http://data.brighterplanet.com)
14
+ * [Brighter Planet's impact estimate web service](http://impact.brighterplanet.com)
15
+
16
+ The killer combination for us is:
17
+
18
+ 1. [`active_record_inline_schema`](https://github.com/seamusabshere/active_record_inline_schema) - define table structure
19
+ 2. [`remote_table`](https://github.com/seamusabshere/remote_table) - download data and parse it
20
+ 3. [`errata`](https://github.com/seamusabshere/errata) - apply corrections in a transparent way
21
+ 4. [`data_miner`](https://github.com/seamusabshere/data_miner) (this library!) - import data idempotently
22
+
23
+ ## Documentation
24
+
25
+ Check out the [extensive documentation](http://rdoc.info/github/seamusabshere/data_miner).
26
+
27
+ ## Quick start
28
+
29
+ You define <code>data_miner</code> blocks in your ActiveRecord models. For example, in <code>app/models/country.rb</code>:
30
+
31
+ class Country < ActiveRecord::Base
32
+ self.primary_key = 'iso_3166_code'
33
+
34
+ data_miner do
35
+ import("OpenGeoCode.org's Country Codes to Country Names list",
36
+ :url => 'http://opengeocode.org/download/countrynames.txt',
37
+ :format => :delimited,
38
+ :delimiter => '; ',
39
+ :headers => false,
40
+ :skip => 22) do
41
+ key :iso_3166_code, :field_number => 0
42
+ store :iso_3166_alpha_3_code, :field_number => 1
43
+ store :iso_3166_numeric_code, :field_number => 2
44
+ store :name, :field_number => 5
45
+ end
46
+ end
47
+ end
48
+
49
+ Now you can run:
50
+
51
+ >> Country.run_data_miner!
52
+ => nil
53
+
54
+ ## More advanced usage
55
+
56
+ The [`earth` library](https://github.com/brighterplanet/earth) has dozens of real-life examples showing how to download, pull out of a ZIP/TAR/BZ2 archive, parse, correct, and import CSVs, fixed-width files, ODS, XLS, XLSX, even HTML and XML:
57
+
58
+ <table>
59
+ <tr>
60
+ <th>Model</th>
61
+ <th>Highlights</th>
62
+ <th>Reference</th>
63
+ </tr>
64
+ <tr>
65
+ <td><a href="http://data.brighterplanet.com/aircraft">Aircraft</a></td>
66
+ <td>parsing Microsoft Frontpage HTML (!)</td>
67
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/aircraft/data_miner.rb">data_miner.rb</a></td>
68
+ </tr>
69
+ <tr>
70
+ <td><a href="http://data.brighterplanet.com/airports">Airports</a></td>
71
+ <td>forcing column names and use of <code>:select</code> block (<code>Proc</code>)</td>
72
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/airport/data_miner.rb">data_miner.rb</a></td>
73
+ </tr>
74
+ <tr>
75
+ <td><a href="http://data.brighterplanet.com/automobile_make_model_year_variants">Automobile model variants</a></td>
76
+ <td>super advanced usage of "custom parser" and errata</td>
77
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/automobile/automobile_make_model_year_variant/data_miner.rb">data_miner.rb</a></td>
78
+ </tr>
79
+ <tr>
80
+ <td><a href="http://data.brighterplanet.com/countries">Country</a></td>
81
+ <td>parsing CSV and a few other tricks</td>
82
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/country/data_miner.rb">data_miner.rb</a></td>
83
+ </tr>
84
+ <tr>
85
+ <td><a href="http://data.brighterplanet.com/egrid_regions">EGRID regions</a></td>
86
+ <td>parsing XLS</td>
87
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/egrid_region/data_miner.rb">data_miner.rb</a></td>
88
+ </tr>
89
+ <tr>
90
+ <td><a href="http://data.brighterplanet.com/flight_segments">Flight segment (stage)</a></td>
91
+ <td>super advanced usage of POSTing form data</td>
92
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/air/flight_segment/data_miner.rb">data_miner.rb</a></td>
93
+ </tr>
94
+ <tr>
95
+ <td><a href="http://data.brighterplanet.com/zip_codes">Zip codes</a></td>
96
+ <td>downloading a ZIP file and pulling an XLSX out of it</td>
97
+ <td><a href="https://github.com/brighterplanet/earth/blob/master/lib/earth/locality/zip_code.rb">data_miner.rb</a></td>
98
+ </tr>
99
+ </table>
100
+
101
+ And many more - look for the `data_miner.rb` file that corresponds to each model. Note that you would normally put the `data_miner` declaration right inside the ActiveRecord model file... it's kept separate in `earth` so that loading it is optional.
102
+
103
+ ## Authors
104
+
105
+ * Seamus Abshere <seamus@abshere.net>
106
+ * Andy Rossmeissl <andy@rossmeissl.net>
107
+ * Derek Kastner <dkastner@gmail.com>
108
+ * Ian Hough <ijhough@gmail.com>
109
+
110
+ ## Copyright
111
+
112
+ Copyright (c) 2012 Brighter Planet. See LICENSE for details.
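Reviewer's sketch, not part of the diff: the Quick start in the README added above skips leading lines, splits each row on `'; '`, and maps field numbers to columns, keyed on `iso_3166_code`. The plain-Ruby snippet below imitates that flow without the gem to show why a keyed import can be re-run safely. The sample rows, `TABLE`, and `sketch_import` are invented for illustration and are not data_miner API.

```ruby
# Invented sample in the shape the Quick start assumes: preamble lines
# to skip, then '; '-delimited rows with no header row.
SAMPLE = <<-TEXT
# preamble line 1
# preamble line 2
AA; AAA; 001; x; x; Exampleland
BB; BBB; 002; x; x; Sampleland
TEXT

# Hypothetical stand-in for a table whose primary key is iso_3166_code.
TABLE = {}

def sketch_import(text, skip)
  text.lines.drop(skip).each do |line|
    fields = line.chomp.split('; ')
    # "key": find or create the record by field 0.
    record = (TABLE[fields[0]] ||= {})
    # "store": copy the remaining fields by position.
    record[:iso_3166_alpha_3_code] = fields[1]
    record[:iso_3166_numeric_code] = fields[2]
    record[:name] = fields[5]
  end
  nil
end

# Running twice leaves the same two records, because rows are matched by key.
2.times { sketch_import(SAMPLE, 2) }
```

Because each row is looked up by its key before being written, a second run updates the existing records instead of duplicating them.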
@@ -7,8 +7,8 @@ Gem::Specification.new do |s|
7
7
  s.authors = ["Seamus Abshere", "Andy Rossmeissl", "Derek Kastner"]
8
8
  s.email = ["seamus@abshere.net"]
9
9
  s.homepage = "https://github.com/seamusabshere/data_miner"
10
- s.summary = %{Mine remote data into your ActiveRecord models.}
11
- s.description = %q{Mine remote data into your ActiveRecord models. You can also convert units.}
10
+ s.summary = %{Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models.}
11
+ s.description = %q{Download, pull out of a ZIP/TAR/GZ/BZ2 archive, parse, correct, and import XLS, ODS, XML, CSV, HTML, etc. into your ActiveRecord models. You can also convert units.}
12
12
 
13
13
  s.rubyforge_project = "data_miner"
14
14
 
@@ -14,7 +14,7 @@ if RUBY_VERSION >= '1.9'
14
14
  end
15
15
  end
16
16
 
17
- require 'data_miner/active_record_extensions'
17
+ require 'data_miner/active_record_class_methods'
18
18
  require 'data_miner/attribute'
19
19
  require 'data_miner/script'
20
20
  require 'data_miner/dictionary'
@@ -24,14 +24,13 @@ require 'data_miner/step/tap'
24
24
  require 'data_miner/step/process'
25
25
  require 'data_miner/run'
26
26
 
27
+ # A singleton class that holds global configuration for data mining.
28
+ #
29
+ # All of its instance methods are delegated to +DataMiner.instance+, so you can call +DataMiner.model_names+, for example.
30
+ #
31
+ # @see DataMiner::ActiveRecordClassMethods#data_miner Overview of how to define data miner scripts inside of ActiveRecord models.
27
32
  class DataMiner
28
33
  class << self
29
- delegate :perform, :to => :instance
30
- delegate :run, :to => :instance
31
- delegate :logger, :to => :instance
32
- delegate :logger=, :to => :instance
33
- delegate :model_names, :to => :instance
34
-
35
34
  # @private
36
35
  def downcase(str)
37
36
  defined?(::UnicodeUtils) ? ::UnicodeUtils.downcase(str) : str.downcase
@@ -48,16 +47,20 @@ class DataMiner
48
47
  end
49
48
  end
50
49
 
51
- MUTEX = ::Mutex.new
52
50
  INNER_SPACE = /[ ]+/
53
51
 
54
52
  include ::Singleton
55
53
 
56
54
  attr_writer :logger
57
55
 
56
+ # Run data miner scripts on models identified by their names. Defaults to all models.
57
+ #
58
+ # @param [optional, Array<String>] model_names Names of models to be run.
59
+ #
60
+ # @return [Array<DataMiner::Run>]
58
61
  def perform(model_names = DataMiner.model_names)
59
62
  Script.uniq do
60
- model_names.each do |model_name|
63
+ model_names.map do |model_name|
61
64
  model_name.constantize.run_data_miner!
62
65
  end
63
66
  end
@@ -66,8 +69,11 @@ class DataMiner
66
69
  # legacy
67
70
  alias :run :perform
68
71
 
72
+ # Where DataMiner logs to. Defaults to +Rails.logger+ or +ActiveRecord::Base.logger+ if either is available.
73
+ #
74
+ # @return [Logger]
69
75
  def logger
70
- @logger || MUTEX.synchronize do
76
+ @logger || ::Thread.exclusive do
71
77
  @logger ||= if defined?(::Rails)
72
78
  ::Rails.logger
73
79
  elsif defined?(::ActiveRecord) and active_record_logger = ::ActiveRecord::Base.logger
@@ -79,12 +85,20 @@ class DataMiner
79
85
  end
80
86
  end
81
87
 
88
+ # Names of the models that have defined a data miner script.
89
+ #
90
+ # @note Models won't appear here until the files containing their data miner scripts have been +require+'d.
91
+ #
92
+ # @return [Set<String>]
82
93
  def model_names
83
- @model_names || MUTEX.synchronize do
94
+ @model_names || ::Thread.exclusive do
84
95
  @model_names ||= ::Set.new
85
96
  end
86
97
  end
87
98
 
99
+ class << self
100
+ delegate(*DataMiner.instance_methods(false), :to => :instance)
101
+ end
88
102
  end
89
103
 
90
- ::ActiveRecord::Base.extend ::DataMiner::ActiveRecordExtensions
104
+ ::ActiveRecord::Base.extend ::DataMiner::ActiveRecordClassMethods
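Reviewer's sketch, not part of the diff: the hunks above replace five explicit `delegate` lines with a single delegation of every instance method to the singleton, and keep the lazy `@x || lock { @x ||= ... }` memoization. A minimal stand-alone version of that pattern follows. It assumes the standard library's `Forwardable` in place of ActiveSupport's `delegate`, and a `Mutex` in place of `Thread.exclusive` (which is not available in current Ruby versions). `Registry`, `LOCK`, and `register` are hypothetical names.

```ruby
require 'singleton'
require 'forwardable'
require 'set'

class Registry
  include Singleton

  # Stand-in for Thread.exclusive: one lock guarding lazy initialization.
  LOCK = Mutex.new

  # Fast path returns the memoized set; the slow path initializes it under the lock.
  def model_names
    @model_names || LOCK.synchronize { @model_names ||= Set.new }
  end

  def register(name)
    model_names.add(name)
    nil
  end

  class << self
    extend Forwardable
    # Delegate every instance method defined directly on Registry to
    # Registry.instance, so callers can write Registry.model_names.
    def_delegators :instance, *Registry.instance_methods(false)
  end
end

Registry.register('Country')
```

Placing the delegation at the end of the class body matters: `instance_methods(false)` only sees methods already defined at that point, which is presumably why the diff moves the `class << self` block to the bottom of `DataMiner`.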
@@ -0,0 +1,108 @@
1
+ require 'active_record'
2
+ require 'lock_method'
3
+
4
+ class DataMiner
5
+ # Class methods that are mixed into models (i.e. ActiveRecord::Base)
6
+ module ActiveRecordClassMethods
7
+ # Access this model's script.
8
+ #
9
+ # @return [DataMiner::Script] This model's data miner script.
10
+ def data_miner_script
11
+ @data_miner_script || ::Thread.exclusive do
12
+ @data_miner_script ||= DataMiner::Script.new(self)
13
+ end
14
+ end
15
+
16
+ # Access to recordkeeping.
17
+ #
18
+ # @return [ActiveRecord::Relation] Records of running the data miner script.
19
+ def data_miner_runs
20
+ DataMiner::Run.scoped :conditions => { :model_name => name }
21
+ end
22
+
23
+ # Run this model's script.
24
+ #
25
+ # @return [DataMiner::Run]
26
+ def run_data_miner!
27
+ data_miner_script.perform
28
+ end
29
+
30
+ # Run the data miner scripts of parent associations. Useful for dependencies. Safe to call using +process+.
31
+ #
32
+ # @note Used extensively in https://github.com/brighterplanet/earth
33
+ #
34
+ # @example Since Provinces depend on Countries, make sure Countries are data mined first
35
+ # class Country < ActiveRecord::Base
36
+ # [...some data miner script...]
37
+ # end
38
+ # class Province < ActiveRecord::Base
39
+ # belongs_to :country
40
+ # data_miner do
41
+ # [...]
42
+ # process "make sure my dependencies have been loaded" do
43
+ # run_data_miner_on_parent_associations!
44
+ # end
45
+ # [...]
46
+ # end
47
+ # end
48
+ #
49
+ # @return [Array<DataMiner::Run>]
50
+ def run_data_miner_on_parent_associations!
51
+ reflect_on_all_associations(:belongs_to).reject do |assoc|
52
+ assoc.options[:polymorphic]
53
+ end.map do |non_polymorphic_belongs_to_assoc|
54
+ non_polymorphic_belongs_to_assoc.klass.run_data_miner!
55
+ end
56
+ end
57
+
58
+ # Define a data miner script.
59
+ #
60
+ # @param [optional, Hash] options
61
+ # @option options [TrueClass, FalseClass] :append (false) Add steps to existing data miner script instead of starting from scratch.
62
+ #
63
+ # @yield [] The block defining the steps.
64
+ #
65
+ # @see DataMiner::Script#import
66
+ # @see DataMiner::Script#process
67
+ # @see DataMiner::Script#tap
68
+ #
69
+ # @example Creating steps
70
+ # class MyModel < ActiveRecord::Base
71
+ # data_miner do
72
+ # process [...]
73
+ # import [...]
74
+ # import [...yes, it's ok to have more than one import step...]
75
+ # process [...]
76
+ # [...etc...]
77
+ # end
78
+ # end
79
+ #
80
+ # @example From the README
81
+ # class Country < ActiveRecord::Base
82
+ # self.primary_key = 'iso_3166_code'
83
+ # data_miner do
84
+ # import("OpenGeoCode.org's Country Codes to Country Names list",
85
+ # :url => 'http://opengeocode.org/download/countrynames.txt',
86
+ # :format => :delimited,
87
+ # :delimiter => '; ',
88
+ # :headers => false,
89
+ # :skip => 22) do
90
+ # key :iso_3166_code, :field_number => 0
91
+ # store :iso_3166_alpha_3_code, :field_number => 1
92
+ # store :iso_3166_numeric_code, :field_number => 2
93
+ # store :name, :field_number => 5
94
+ # end
95
+ # end
96
+ # end
97
+ #
98
+ # @return [nil]
99
+ def data_miner(options = {}, &blk)
100
+ DataMiner.model_names.add name
101
+ unless options[:append]
102
+ @data_miner_script = nil
103
+ end
104
+ data_miner_script.append_block blk
105
+ nil
106
+ end
107
+ end
108
+ end
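Reviewer's sketch, not part of the diff: the `data_miner` class method above discards the existing script unless `:append` is passed, then appends the new block. The snippet below mirrors that control flow in plain Ruby so the reset-versus-append behavior can be checked in isolation. `SketchScript`, `SketchClassMethods`, `sketch_miner`, and `Widget` are hypothetical stand-ins, and the thread-safety guard is omitted.

```ruby
# Hypothetical stand-in for DataMiner::Script: it only collects blocks.
class SketchScript
  attr_reader :blocks

  def initialize
    @blocks = []
  end

  def append_block(blk)
    @blocks << blk
  end
end

module SketchClassMethods
  def sketch_script
    @sketch_script ||= SketchScript.new
  end

  # Same shape as data_miner(options = {}, &blk) in the diff:
  # start from scratch unless :append is set, then add the block.
  def sketch_miner(options = {}, &blk)
    @sketch_script = nil unless options[:append]
    sketch_script.append_block(blk)
    nil
  end
end

class Widget
  extend SketchClassMethods

  sketch_miner { :first }
  sketch_miner(:append => true) { :second }
end
```

After the class body runs, `Widget.sketch_script.blocks` holds both blocks. A later call without `:append` would replace them with a single new block.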
@@ -1,8 +1,14 @@
1
1
  require 'conversions'
2
2
 
3
3
  class DataMiner
4
+ # A mapping between a local model column and a remote data source column.
5
+ #
6
+ # @see DataMiner::ActiveRecordClassMethods#data_miner Overview of how to define data miner scripts inside of ActiveRecord models.
7
+ # @see DataMiner::Step::Import#store
8
+ # @see DataMiner::Step::Import#key
4
9
  class Attribute
5
10
  class << self
11
+ # @private
6
12
  def check_options(options)
7
13
  errors = []
8
14
  if options[:dictionary].is_a?(Dictionary)
@@ -18,26 +24,26 @@ class DataMiner
18
24
  end
19
25
  end
20
26
 
21
- VALID_OPTIONS = %w{
22
- from_units
23
- to_units
24
- static
25
- dictionary
26
- matcher
27
- field_name
28
- delimiter
29
- split
30
- units
31
- sprintf
32
- nullify
33
- overwrite
34
- upcase
35
- units_field_name
36
- units_field_number
37
- field_number
38
- chars
39
- synthesize
40
- }.map(&:to_sym)
27
+ VALID_OPTIONS = [
28
+ :from_units,
29
+ :to_units,
30
+ :static,
31
+ :dictionary,
32
+ :matcher,
33
+ :field_name,
34
+ :delimiter,
35
+ :split,
36
+ :units,
37
+ :sprintf,
38
+ :nullify,
39
+ :overwrite,
40
+ :upcase,
41
+ :units_field_name,
42
+ :units_field_number,
43
+ :field_number,
44
+ :chars,
45
+ :synthesize,
46
+ ]
41
47
 
42
48
  VALID_UNIT_DEFINITION_SETS = [
43
49
  [:units],
@@ -48,30 +54,102 @@ class DataMiner
48
54
  [:units_field_number, :to_units],
49
55
  ]
50
56
 
51
- DEFAULT_SPLIT = /\s+/
52
- DEFAULT_KEEP = 0
57
+ DEFAULT_SPLIT_PATTERN = /\s+/
58
+ DEFAULT_SPLIT_KEEP = 0
53
59
  DEFAULT_DELIMITER = ', '
54
60
  DEFAULT_NULLIFY = false
55
61
  DEFAULT_UPCASE = false
56
62
  DEFAULT_OVERWRITE = true
57
63
 
64
+ # @private
58
65
  attr_reader :step
66
+
67
+ # Local column name.
68
+ # @return [Symbol]
59
69
  attr_reader :name
70
+
71
+ # Synthesize a value by passing a proc that will receive +row+ and should return a final value.
72
+ #
73
+ # +row+ will be a +Hash+ with string keys or (less often) an +Array+
74
+ #
75
+ # @return [Proc]
60
76
  attr_reader :synthesize
77
+
78
+ # An object that will be sent +#match(row)+ and should return a final value.
79
+ #
80
+ # Can be specified as a String which will be constantized into a class and an object of that class instantized with no arguments.
81
+ #
82
+ # +row+ will be a +Hash+ with string keys or (less often) an +Array+
83
+ # @return [Object]
61
84
  attr_reader :matcher
85
+
86
+ # Index of where to find the data in the row, starting from zero.
87
+ #
88
+ # If you pass a +Range+, then multiple fields will be joined together.
89
+ #
90
+ # @return [Integer, Range]
62
91
  attr_reader :field_number
92
+
93
+ # Where to find the data in the row.
94
+ # @return [Symbol]
63
95
  attr_reader :field_name
64
- # For use when joining a range of field numbers
96
+
97
+ # A delimiter to be used when joining fields together into a single final value. Used when +:field_number+ is a +Range+. Defaults to DEFAULT_DELIMITER.
98
+ # @return [String]
65
99
  attr_reader :delimiter
100
+
101
+ # Which characters in a field to keep. Zero-based.
102
+ # @return [Range]
66
103
  attr_reader :chars
104
+
105
+ # How to split a field. You specify two options:
106
+ #
107
+ # +:pattern+: what to split on. Defaults to DEFAULT_SPLIT_PATTERN.
108
+ # +:keep+: which of the elements resulting from the split to keep. Defaults to DEFAULT_SPLIT_KEEP.
109
+ #
110
+ # @return [Hash]
67
111
  attr_reader :split
112
+
113
+ # Final units. May invoke a conversion using https://github.com/seamusabshere/conversions
114
+ #
115
+ # If a local column named +[name]_units+ exists, it will be populated with this value.
116
+ #
117
+ # @return [Symbol]
68
118
  attr_reader :to_units
119
+
120
+ # Initial units. May invoke a conversion using https://github.com/seamusabshere/conversions
121
+ # @return [Symbol]
69
122
  attr_reader :from_units
123
+
124
+ # If every row specifies its own units, index of where to find the units. Zero-based.
125
+ # @return [Integer]
70
126
  attr_reader :units_field_number
127
+
128
+ # If every row specifies its own units, where to find the units.
129
+ # @return [Symbol]
71
130
  attr_reader :units_field_name
131
+
132
+ # A +sprintf+-style format to apply.
133
+ # @return [String]
72
134
  attr_reader :sprintf
135
+
136
+ # A static value to be used.
137
+ # @return [String,Numeric,TrueClass,FalseClass,Object]
73
138
  attr_reader :static
74
139
 
140
+ # Whether to store nil in the local column when the remote value is blank, even if the column was previously non-null. Defaults to DEFAULT_NULLIFY.
141
+ # @return [TrueClass,FalseClass]
142
+ attr_reader :nullify
143
+
144
+ # Whether to upcase value. Defaults to DEFAULT_UPCASE.
145
+ # @return [TrueClass,FalseClass]
146
+ attr_reader :upcase
147
+
148
+ # Whether to overwrite the value in a local column if it is not null. Defaults to DEFAULT_OVERWRITE.
149
+ # @return [TrueClass,FalseClass]
150
+ attr_reader :overwrite
151
+
152
+ # @private
75
153
  def initialize(step, name, options = {})
76
154
  options = options.symbolize_keys
77
155
  if (errors = Attribute.check_options(options)).any?
@@ -81,7 +159,7 @@ class DataMiner
81
159
  @name = name
82
160
  @synthesize = options[:synthesize]
83
161
  if @dictionary_boolean = options.has_key?(:dictionary)
84
- @dictionary_options = options[:dictionary]
162
+ @dictionary_settings = options[:dictionary]
85
163
  end
86
164
  @matcher = options[:matcher].is_a?(::String) ? options[:matcher].constantize.new : options[:matcher]
87
165
  if @static_boolean = options.has_key?(:static)
@@ -94,52 +172,42 @@ class DataMiner
94
172
  if split = options[:split]
95
173
  @split = split.symbolize_keys
96
174
  end
97
- @nullify_boolean = options.fetch :nullify, DEFAULT_NULLIFY
98
- @upcase_boolean = options.fetch :upcase, DEFAULT_UPCASE
175
+ @nullify = options.fetch :nullify, DEFAULT_NULLIFY
176
+ @upcase = options.fetch :upcase, DEFAULT_UPCASE
99
177
  @from_units = options[:from_units]
100
178
  @to_units = options[:to_units] || options[:units]
101
179
  @sprintf = options[:sprintf]
102
- @overwrite_boolean = options.fetch :overwrite, DEFAULT_OVERWRITE
180
+ @overwrite = options.fetch :overwrite, DEFAULT_OVERWRITE
103
181
  @units_field_name = options[:units_field_name]
104
182
  @units_field_number = options[:units_field_number]
105
183
  @dictionary_mutex = ::Mutex.new
106
184
  end
107
185
 
108
- def model
109
- step.model
110
- end
111
-
112
- def static?
113
- @static_boolean
114
- end
115
-
116
- def nullify?
117
- @nullify_boolean
118
- end
119
-
120
- def upcase?
121
- @upcase_boolean
122
- end
123
-
124
- def dictionary?
125
- @dictionary_boolean
126
- end
127
-
128
- def convert?
129
- from_units.present? or units_field_name.present? or units_field_number.present?
186
+ # Dictionary for translating.
187
+ #
188
+ # You pass a +Hash+ of options which is used to initialize a +DataMiner::Dictionary+.
189
+ #
190
+ # @return [DataMiner::Dictionary]
191
+ def dictionary
192
+ @dictionary || @dictionary_mutex.synchronize do
193
+ @dictionary ||= Dictionary.new(@dictionary_settings)
194
+ end
130
195
  end
131
196
 
132
- def units?
133
- to_units.present? or units_field_name.present? or units_field_number.present?
197
+ # @private
198
+ def set_from_row(local_record, remote_row)
199
+ if overwrite or local_record.send(name).nil?
200
+ local_record.send "#{name}=", read(remote_row)
201
+ end
202
+ if units? and ((final_to_units = (to_units || read_units(remote_row))) or nullify)
203
+ local_record.send "#{name}_units=", final_to_units
204
+ end
134
205
  end
135
206
 
136
- def overwrite?
137
- @overwrite_boolean
138
- end
139
-
207
+ # @private
140
208
  def read(row)
141
- if matcher and matched_row = matcher.match(row)
142
- return matched_row
209
+ if matcher and matcher_output = matcher.match(row)
210
+ return matcher_output
143
211
  end
144
212
  if synthesize
145
213
  return synthesize.call(row)
@@ -168,15 +236,15 @@ class DataMiner
168
236
  value = value[chars]
169
237
  end
170
238
  if split
171
- pattern = split.fetch :pattern, DEFAULT_SPLIT
172
- keep = split.fetch :keep, DEFAULT_KEEP
239
+ pattern = split.fetch :pattern, DEFAULT_SPLIT_PATTERN
240
+ keep = split.fetch :keep, DEFAULT_SPLIT_KEEP
173
241
  value = value.to_s.split(pattern)[keep].to_s
174
242
  end
175
243
  value = DataMiner.compress_whitespace value
176
- if nullify? and value.blank?
244
+ if nullify and value.blank?
177
245
  return
178
246
  end
179
- if upcase?
247
+ if upcase
180
248
  value = DataMiner.upcase value
181
249
  end
182
250
  if convert?
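Reviewer's sketch, not part of the diff: the hunk above applies `chars`, then `split` (with `:pattern` and `:keep` defaults), then whitespace compression, `nullify`, and `upcase`, in that order. The function below reproduces that order in plain Ruby. `sketch_clean` is a hypothetical name, and the whitespace step is an assumption based on `INNER_SPACE`, since the body of `DataMiner.compress_whitespace` does not appear in this diff.

```ruby
DEFAULT_SPLIT_PATTERN = /\s+/
DEFAULT_SPLIT_KEEP = 0

def sketch_clean(value, options = {})
  value = value.to_s
  # :chars keeps a zero-based range of characters.
  value = value[options[:chars]].to_s if options[:chars]
  if (split = options[:split])
    pattern = split.fetch(:pattern, DEFAULT_SPLIT_PATTERN)
    keep = split.fetch(:keep, DEFAULT_SPLIT_KEEP)
    value = value.split(pattern)[keep].to_s
  end
  # Assumed behavior of compress_whitespace: collapse runs of spaces and trim.
  value = value.gsub(/[ ]+/, ' ').strip
  # :nullify turns a blank result into nil.
  return nil if options[:nullify] && value.empty?
  value = value.upcase if options[:upcase]
  value
end
```

For example, `sketch_clean("12 kg", :split => { :keep => 1 })` keeps the second element of the split, and `sketch_clean("   ", :nullify => true)` returns `nil`.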
@@ -201,27 +269,33 @@ class DataMiner
201
269
  value
202
270
  end
203
271
 
204
- def set_from_row(target, row)
205
- if overwrite? or target.send(name).nil?
206
- target.send "#{name}=", read(row)
207
- end
208
- if units? and ((final_to_units = (to_units || read_units(row))) or nullify?)
209
- target.send "#{name}_units=", final_to_units
210
- end
211
- end
212
-
213
- def dictionary
214
- @dictionary || @dictionary_mutex.synchronize do
215
- @dictionary ||= Dictionary.new(@dictionary_options)
216
- end
217
- end
218
-
272
+ # @private
219
273
  def refresh
220
274
  @dictionary = nil
221
275
  end
222
276
 
223
277
  private
224
278
 
279
+ def model
280
+ step.model
281
+ end
282
+
283
+ def static?
284
+ @static_boolean
285
+ end
286
+
287
+ def dictionary?
288
+ @dictionary_boolean
289
+ end
290
+
291
+ def convert?
292
+ from_units.present? or units_field_name.present? or units_field_number.present?
293
+ end
294
+
295
+ def units?
296
+ to_units.present? or units_field_name.present? or units_field_number.present?
297
+ end
298
+
225
299
  def read_units(row)
226
300
  if units = row[units_field_name || units_field_number]
227
301
  DataMiner.compress_whitespace(units).underscore.to_sym