datashift 0.13.0 → 0.14.0

data/README.markdown CHANGED
@@ -1,5 +1,10 @@
  ## DataShift
 
+ - [Features](#features)
+ - [Installation](#installation)
+ - [Active Record - Import/Export](#Active Record - Import/Export)
+ - [License](#license)
+
  Provides tools to shift data between Excel/CSV files and Rails projects and Ruby applications
 
  Import and export models fully with all associations.
@@ -7,7 +12,8 @@ Import and export models fully with all associations.
  Comprehensive Wiki here : **https://github.com/autotelik/datashift/wiki**
 
  Specific command line tools and full Product loading for Spree E-Commerce
- now seperate gem at [datashift_spree](https://github.com/autotelik/datashift_spree "Datashift Spree")
+ now separate gem at [datashift_spree](https://github.com/autotelik/datashift_spree "Datashift Spree")
+
 
  ### Features
 
@@ -40,11 +46,15 @@ Many example Spreadsheets/CSV files in spec/fixtures, fully documented with comm
 
  Add gem 'datashift' to your Gemfile/bundle or use ```gem install```
 
- ```ruby gem 'datashift' ```
+ ```ruby
+ gem 'datashift'
+ ```
 
  For Spree support also add :
 
- ```ruby gem 'datashift_spree' ```
+ ```ruby
+ gem 'datashift_spree'
+ ```
 
  To use :
 
@@ -106,6 +116,26 @@ Provides high level tasks for exporting data to various sources, currently .xls
 
  bundle exec thor datashift:export:excel model=BlogPost result=BlogExport.xls
 
+
+ Import based on column headings with *Semi-Smart Name Lookup*
+
+ On import, first a dictionary of all possible attributes and associations is created for the AR class.
+
+ This enables lookup, of a user supplied name (column heading), managing white space, pluralisation etc .
+
+ Example usage, load from a file or spreadsheet where the column names are only
+ an approximation of the actual associations, so given 'Product Properties' heading,
+ finds real association 'product_properties' to send or call on the AR object
+
+
+ Can import/export 'belongs_to, 'has_many' and 'has_one' associations, including assignment of multiple objects
+ via either multiple columns, or via a DSL for creating multiple entries in a single (column).
+
+ The DSL can also be used to define which fields to lookup associations, and assign values to other fields.
+
+ See Wiki for more details on DSL syntax.
+
+ Supports inclusion of delegated attributes and normal instance methods as column headings.
 
  The library can be easily extended with Loaders to deal with non trivial cases,
  for example when multiple lookups required to find right association.
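Reviewer sketch: the *Semi-Smart Name Lookup* described in the added README lines can be illustrated in plain Ruby. This is only a minimal sketch under stated assumptions, not the gem's implementation: `KNOWN_OPERATORS`, `candidate_names` and `lookup_operator` are hypothetical names, and the real code also uses ActiveSupport inflections, which are left out here so the example runs standalone.

```ruby
# Hypothetical stand-in for the dictionary built from an AR class;
# these operator names are invented for illustration.
KNOWN_OPERATORS = %w[name price product_properties count_on_hand].freeze

# Generate candidate spellings for a user supplied column heading.
# Plain Ruby only (assumption: no ActiveSupport available).
def candidate_names(heading)
  name = heading.to_s.strip
  [
    name,
    name.downcase,
    name.gsub(' ', '_'),
    name.gsub(' ', '_').downcase,
    name.gsub(/\s+/, ''),
    name.gsub(/\s+/, '').downcase
  ].uniq
end

# Return the first known operator matching any candidate, else nil.
def lookup_operator(heading, known = KNOWN_OPERATORS)
  candidate_names(heading).find { |candidate| known.include?(candidate) }
end

lookup_operator('Product Properties')  # => "product_properties"
lookup_operator('Count On hand')       # => "count_on_hand"
lookup_operator('Unknown Heading')     # => nil
```

The exact heading is tried first, so a column that already uses the real operator name never depends on the looser alternatives.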
@@ -135,22 +165,8 @@ This data can be exported directly to CSV or Excel/OpenOffice spreadsheets.
  Column headings contain comments with full descriptions and instructions on syntax.
 
 
- ## Features
-
- - *Associations*
-
- Can import/export 'belongs_to, 'has_many' and 'has_one' associations, including assignment of multiple objects
- via either multiple columns, or via a DSL for creating multiple entries in a single (column).
+ ## Excel
 
- The DSL can also be used to define which fields to lookup associations, and assign values to other fields.
-
- See Wiki for more details on DSL syntax.
-
- Supports delegated attributes.
-
- - *High level wrappers around applications including Excel and Word
-
- Quickly and easily access common enterprise applications through Ruby
 
  MS Excel itself does not need to be installed.
 
@@ -160,55 +176,10 @@ This data can be exported directly to CSV or Excel/OpenOffice spreadsheets.
 
  The required POI jars are already included.
 
- - *Direct Excel export*
-
  Excel/OpenOffice spreadsheets are heavily used in many sectors, so direct support makes it
- easier and quicker to migrate your client's data into a Rails/ActiveRecord project.
-
- No need to save to CSV or map to YAML.
-
- - *Semi-Smart Name Lookup*
-
- Includes helper classes that find and store details of all possible associations on an AR class.
- Given a user supplied name, attempts to find the requested association.
-
- Example usage, load from a file or spreadsheet where the column names are only
- an approximation of the actual associations, so given 'Product Properties' heading,
- finds real association 'product_properties' to send or call on the AR object
-
-
-
- - *Thor Tasks*
-
- High level Thor CLIs are provided, only required to supply model class, and file location :
-
- thor datashift:import:excel model=MusicTrack input=MyTrackListing.xls
-
-
- - *Spree Tasks*
-
- Spree's product associations are non trivial so specific Rake tasks are also provided for loading Spree Producta
- with all associations and Image loading.
-
- thor datashift:spree:products input=C:\MyProducts.xls
-
-
- - *Seamless Spree Image loading can be achieved by ensuring SKU or class Name features in Image filename.
-
- Lookup is performed either via the SKU being prepended to the image name, or by the image name being equal to the **name attribute** of the klass in question.
-
- Images can be attached to any class defined with a suitable association. The class to use can be configured in rake task via
- parameter klass=Xyz.
-
- In the Spree tasks, this defaults to Product, so attempts to attach Image to a Product via Product SKU or Name.
-
- A report is generated in the current working directory detailing any Images in the paths that could not be matched with a Product.
-
- thor datashift:spree:images input=C:\images\product_images skip_if_no_assoc=true
-
- thor datashift:spree:images input=C:\images\taxon_icons skip_if_no_assoc=true klass=Taxon
+ easier and quicker to migrate your client's data into a Rails/ActiveRecord project,
+ without converting first to CSV or YAML.
 
- ## Import to Active Record
 
  ### Associations
 
@@ -237,7 +208,6 @@ During loading, a call to find_all_by_reference will be made, picking up the 2 c
  - Look at implementing import/export API using something like https://github.com/ianwhite/orm_adapter
  rather than active record, so we can support additional ORMs
 
- - Create separate Spree extension to support import/export via the admin gui
 
  ## License
 
data/VERSION CHANGED
@@ -1 +1 @@
- 0.13.0
+ 0.14.0
@@ -95,10 +95,14 @@ if(DataShift::Guards::jruby?)
 
  return create_sheet_and_set_styles( sheet_name )
  else
- if (@workbook.getSheetIndex(sheet_name) < 0) #Check sheet doesn't already exist
- return create_sheet_and_set_styles( sheet_name )
+
+ name = sanitize_sheet_name( sheet_name )
+
+ puts "WTF #{name}"
+ if (@workbook.getSheetIndex(name) < 0) #Check sheet doesn't already exist
+ return create_sheet_and_set_styles( name )
  else
- activate_sheet(sheet_name)
+ activate_sheet(name)
  end
  end
  end
@@ -205,9 +209,12 @@ if(DataShift::Guards::jruby?)
  private
 
  def create_sheet_and_set_styles( sheet_name )
- @sheet = @workbook.createSheet( sanitize_sheet_name(sheet_name) )
+
+ name = sanitize_sheet_name(sheet_name)
+
+ @sheet = @workbook.createSheet( name )
 
- @patriarchs.store(sheet_name, @sheet.createDrawingPatriarch())
+ @patriarchs.store(name, @sheet.createDrawingPatriarch())
 
  @date_style = @workbook.createCellStyle
  @date_style.setDataFormat( JExcelFile::date_format )
data/lib/datashift.rb CHANGED
@@ -32,21 +32,9 @@
  #
  # DataShift::load_commands
  #
- require 'rbconfig'
- require 'guards'
 
  module DataShift
 
- if(Guards::jruby?)
- require 'java'
-
- class Object
- def add_to_classpath(path)
- $CLASSPATH << File.join( DataShift.root_path, 'lib', path.gsub("\\", "/") )
- end
- end
- end
-
  def self.gem_version
  unless(@gem_version)
  if(File.exists?('VERSION'))
@@ -126,4 +114,21 @@ module DataShift
 
  end
 
- DataShift::require_libraries
+ DataShift::require_libraries
+
+ require 'datashift/guards'
+ require 'datashift/method_detail'
+ require 'datashift/method_dictionary'
+ require 'datashift/method_mapper'
+
+ module DataShift
+ if(Guards::jruby?)
+ require 'java'
+
+ class Object
+ def add_to_classpath(path)
+ $CLASSPATH << File.join( DataShift.root_path, 'lib', path.gsub("\\", "/") )
+ end
+ end
+ end
+ end
@@ -17,7 +17,6 @@ module DataShift
  # Support multiple associations being added to a base object to be specified in a single column.
  #
  # Entry represents the association to find via supplied name, value to use in the lookup.
- # Can contain multiple lookup name/value pairs, separated by multi_assoc_delim ( | )
  #
  # Default syntax :
  #
File without changes
@@ -20,10 +20,7 @@ module DataShift
  class MethodDetail
 
  include DataShift::Logging
-
- include DataShift::Populator
- extend DataShift::Populator
-
+
  def self.supported_types_enum
  @type_enum ||= Set[:assignment, :belongs_to, :has_one, :has_many]
  @type_enum
@@ -33,10 +30,7 @@ module DataShift
  @assoc_type_enum ||= Set[:belongs_to, :has_one, :has_many]
  @assoc_type_enum
  end
-
- # When looking up an association, try each of these in turn till a match
- # i.e find_by_name .. find_by_title and so on, lastly try the raw id
- @@insistent_find_by_list ||= [:name, :title, :id]
+
 
  # Name is the raw, client supplied name
  attr_accessor :name
@@ -115,60 +109,6 @@ module DataShift
  @operator_class
  end
 
- def assign(record, value )
-
- @current_value = value
-
- # logger.info("WARNING nil value supplied for Column [#{@name}]") if(@current_value.nil?)
-
- if( operator_for(:belongs_to) )
-
- #puts "DEBUG : BELONGS_TO : #{@name} : #{operator} - Lookup #{@current_value} in DB"
- insistent_belongs_to(record, @current_value)
-
- elsif( operator_for(:has_many) )
-
- #puts "DEBUG : VALUE TYPE [#{value.class.name.include?(operator.classify)}] [#{ModelMapper.class_from_string(value.class.name)}]" unless(value.is_a?(Array))
-
- # The include? check is best I can come up with right now .. to handle module/namespaces
- # TODO - can we determine the real class type of an association
- # e.g given a association taxons, which operator.classify gives us Taxon, but actually it's Spree::Taxon
- # so how do we get from 'taxons' to Spree::Taxons ? .. check if further info in reflect_on_all_associations
-
- if(value.is_a?(Array) || value.class.name.include?(operator.classify))
- record.send(operator) << value
- else
- puts "ERROR #{value.class} - Not expected type for has_many #{operator} - cannot assign"
- end
-
- elsif( operator_for(:has_one) )
-
- #puts "DEBUG : HAS_MANY : #{@name} : #{operator}(#{operator_class}) - Lookup #{@current_value} in DB"
- if(value.is_a?(operator_class))
- record.send(operator + '=', value)
- else
- logger.error("ERROR #{value.class} - Not expected type for has_one #{operator} - cannot assign")
- # TODO - Not expected type - maybe try to look it up somehow ?"
- #insistent_has_many(record, @current_value)
- end
-
- elsif( operator_for(:assignment) && @col_type )
- #puts "DEBUG : COl TYPE defined for #{@name} : #{@assignment} => #{@current_value} #{@col_type.type}"
- # puts "DEBUG : Column [#{@name}] : COl TYPE CAST: #{@current_value} => #{@col_type.type_cast( @current_value ).inspect}"
- record.send( operator + '=' , @col_type.type_cast( @current_value ) )
-
- #puts "DEBUG : MethodDetails Assignment RESULT: #{record.send(operator)}"
-
- elsif( operator_for(:assignment) )
- #puts "DEBUG : Column [#{@name}] : Brute force assignment of value #{@current_value}"
- # brute force case for assignments without a column type (which enables us to do correct type_cast)
- # so in this case, attempt straightforward assignment then if that fails, basic ops such as to_s, to_i, to_f etc
- insistent_assignment(record, @current_value)
- else
- puts "WARNING: No operator found for assignment on #{self.inspect} for Column [#{@name}]"
- end
- end
-
  def pp
  "#{@name} => #{operator}"
  end
@@ -222,12 +162,9 @@ module DataShift
  end
  end
  end
-
- def insistent_assignment( record, value )
- Populator::insistent_assignment( record, value, operator)
- end
 
- private
+ private
+
  # Return the operator's expected class, if can be derived, else nil
  def get_operator_class()
  if(operator_for(:has_many) || operator_for(:belongs_to) || operator_for(:has_one))
@@ -23,10 +23,15 @@ module DataShift
  end
 
  def add(method_details)
+ #puts "DEBUG: MGR - Add {#method_details.operator_type}\n#{method_details.inspect}"
  @method_details[method_details.operator_type.to_sym] ||= {}
+
+ # Mapped by Type and MethodDetail name
+ @method_details[method_details.operator_type.to_sym][method_details.name] = method_details
+
+ # Helper list of all available by type
  @method_details_list[method_details.operator_type.to_sym] ||= []
 
- @method_details[method_details.operator_type.to_sym][method_details.name] = method_details
  @method_details_list[method_details.operator_type.to_sym] << method_details
  @method_details_list[method_details.operator_type.to_sym].uniq!
  end
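Reviewer sketch: the `add` method in this hunk keeps two structures per operator type, a hash keyed by name and a de-duplicated list. The following is a minimal standalone illustration of that bookkeeping, assuming a simplified record; `Detail` and `DetailsRegistry` are hypothetical names and are not part of the gem.

```ruby
# Hypothetical, simplified stand-in for MethodDetail.
Detail = Struct.new(:name, :operator, :operator_type)

class DetailsRegistry
  def initialize
    @by_type_and_name = {}  # { type => { name => detail } }
    @list_by_type     = {}  # { type => [detail, ...] }
  end

  # Register a detail under its type, both by name and in the per-type list.
  def add(detail)
    type = detail.operator_type.to_sym
    (@by_type_and_name[type] ||= {})[detail.name] = detail

    list = (@list_by_type[type] ||= [])
    list << detail unless list.include?(detail)  # keep the list free of duplicates
  end

  # Look up a single detail by name within one type, nil if absent.
  def find(name, type)
    by_name = @by_type_and_name[type.to_sym]
    by_name ? by_name[name] : nil
  end

  # All externally supplied names registered for a type.
  def names(type)
    (@list_by_type[type.to_sym] || []).map(&:name)
  end
end

registry = DetailsRegistry.new
registry.add(Detail.new('Product Properties', 'product_properties', :has_many))
registry.add(Detail.new('Product Properties', 'product_properties', :has_many)) # duplicate

registry.find('Product Properties', :has_many).operator  # => "product_properties"
registry.find('Product Properties', :belongs_to)         # => nil
registry.names(:has_many)                                # => ["Product Properties"]
```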
@@ -38,10 +43,11 @@ module DataShift
  def find(name, type)
  method_details = get(type)
 
- method_details ? method_details[name] : nil
+ method_details ? method_details[name] : nil
  end
 
- # type is expected to be one of MethodDetail::supportedtype_enum
+ # type is expected to be one of MethodDetail::supported_types_enum
+ # Returns all MethodDetail(s) for supplied type
  def get( type )
  @method_details[type.to_sym]
  end
@@ -51,9 +57,15 @@ module DataShift
  end
 
  alias_method(:get_list_of_method_details, :get_list)
-
- def get_operators( op_type )
- get_list(op_type).collect { |md| md.operator }
+
+ # Get list of the inbound or externally supplied names
+ def get_names(type)
+ get_list(type).collect { |md| md.name }
+ end
+
+ # Get list of Rails model operators
+ def get_operators(type)
+ get_list(type).collect { |md| md.operator }
  end
 
  alias_method(:get_list_of_operators, :get_list)
@@ -5,8 +5,6 @@
  #
  # Details:: A cache type class that stores details of all possible associations on AR classes.
  #
- require 'method_detail'
-
  module DataShift
 
  class MethodDictionary
@@ -23,7 +21,7 @@ module DataShift
  # grouped by type of association (includes belongs_to and has_many which provides both << and = )
  # Options:
  # :reload => clear caches and re-perform lookup
- # :instance_methods => if true include instance method type assignment operators as well as model's pure columns
+ # :instance_methods => if true include instance method type 'setters' as well as model's pure columns
  #
  def self.find_operators(klass, options = {} )
 
@@ -34,50 +32,39 @@ module DataShift
  has_many[klass] = klass.reflect_on_all_associations(:has_many).map { |i| i.name.to_s }
  klass.reflect_on_all_associations(:has_and_belongs_to_many).inject(has_many[klass]) { |x,i| x << i.name.to_s }
  end
-
- # puts "DEBUG: Has Many Associations:", has_many[klass].inspect
 
  # Find the belongs_to associations which can be populated via Model.belongs_to_name = OtherArModelObject
  if( options[:reload] || belongs_to[klass].nil? )
  belongs_to[klass] = klass.reflect_on_all_associations(:belongs_to).map { |i| i.name.to_s }
  end
 
- #puts "Belongs To Associations:", belongs_to[klass].inspect
-
  # Find the has_one associations which can be populated via Model.has_one_name = OtherArModelObject
  if( options[:reload] || has_one[klass].nil? )
  has_one[klass] = klass.reflect_on_all_associations(:has_one).map { |i| i.name.to_s }
  end
 
- #puts "has_one Associations:", self.has_one[klass].inspect
-
  # Find the model's column associations which can be populated via xxxxxx= value
  # Note, not all reflections return method names in same style so we convert all to
  # the raw form i.e without the '=' for consistency
  if( options[:reload] || assignments[klass].nil? )
 
- # TODO investigate difference with attribute_names - maybe column names can be assigned to an attribute
- # so in terms of method calls on klass attribute_names might be safer
  assignments[klass] = klass.column_names
+
+ # get into consistent format with other assignments names i.e remove the = for now
+ assignments[klass] += setters(klass).map{|i| i.gsub(/=/, '')} if(options[:instance_methods])
 
- if(options[:instance_methods] == true)
- setters = klass.instance_methods.grep(/\w+=/).collect {|x| x.to_s }
-
- # TODO - Since 3.2 this seems to return lots more stuff including validations which might not be appropriate
- if(klass.respond_to? :defined_activerecord_methods)
- setters = setters - klass.defined_activerecord_methods.to_a
- end
-
- # get into same format as other names
- assignments[klass] += setters.map{|i| i.gsub(/=/, '')}
- end
-
- assignments[klass] -= has_many[klass] if(has_many[klass])
+ # Now remove all the associations
+ assignments[klass] -= has_many[klass] if(has_many[klass])
  assignments[klass] -= belongs_to[klass] if(belongs_to[klass])
- assignments[klass] -= self.has_one[klass] if(self.has_one[klass])
-
+ assignments[klass] -= has_one[klass] if(has_one[klass])
+
+ # TODO remove assignments with id
+ # assignments => tax_id but already in belongs_to => tax
+
  assignments[klass].uniq!
 
+ #puts "\nDEBUG: DICT Setters\n#{assignments[klass]}\n"
+
  assignments[klass].each do |assign|
  column_types[klass] ||= {}
  column_def = klass.columns.find{ |col| col.name == assign }
@@ -86,6 +73,18 @@ module DataShift
  end
  end
 
+ def self.setters( klass )
+
+ # N.B In 1.8 these return strings, in 1.9 symbols.
+ # map everything to strings a
+ #setters = klass.accessible_attributes.sort.collect( &:to_s )
+
+ # remove methodsa that start with '_'
+ @keep_only_pure_setters ||= Regexp.new(/^[a-zA-Z]\w+=/)
+
+ setters = klass.instance_methods.grep(@keep_only_pure_setters).sort.collect( &:to_s )
+ setters.uniq
+ end
 
  def self.add( klass, operator, type = :assignment)
  method_details_mgr = get_method_details_mgr( klass )
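Reviewer sketch: the new `setters` helper filters instance methods with the pattern `/^[a-zA-Z]\w+=/`. The behaviour of that filter can be tried on a plain Ruby class. `Track` and `setter_names` are hypothetical names used only for illustration, and the sketch calls `instance_methods(false)` (methods defined on the class itself) rather than the `instance_methods` call in the diff, to keep the result small and predictable.

```ruby
# Hypothetical plain Ruby class standing in for an ActiveRecord model.
class Track
  attr_accessor :title, :length
  attr_writer :_internal   # leading underscore, filtered out below

  def ==(other)            # operator method, not a setter
    other.is_a?(Track) && other.title == title
  end
end

# Same pattern as in the diff: a letter, one or more word characters, then '='.
PURE_SETTER = /^[a-zA-Z]\w+=/

def setter_names(klass)
  # instance_methods(false) limits the list to methods defined on klass itself
  klass.instance_methods(false).grep(PURE_SETTER).map(&:to_s).sort.uniq
end

setter_names(Track)                            # => ["length=", "title="]
setter_names(Track).map { |s| s.delete('=') }  # => ["length", "title"]
```

Because the pattern requires a letter followed by at least one more word character, a single-letter setter such as `x=` would not match it.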
@@ -119,14 +118,14 @@ module DataShift
  method_details_mgrs[klass] = method_details_mgr
 
  end
-
+
  # TODO - check out regexp to do this work better plus Inflections ??
  # Want to be able to handle any of ["Count On hand", 'count_on_hand', "Count OnHand", "COUNT ONHand" etc]
  def self.substitutions(external_name)
  name = external_name.to_s
 
  [
- name,
+ name.downcase,
  name.tableize,
  name.gsub(' ', '_'),
  name.gsub(' ', '_').downcase,
@@ -137,18 +136,24 @@ module DataShift
  ]
  end
 
+
  # Find the proper format of name, appropriate call + column type for a given name.
  # e.g Given users entry in spread sheet check for pluralization, missing underscores etc
  #
  # If not nil, returned method can be used directly in for example klass.new.send( call, .... )
  #
- def self.find_method_detail( klass, external_name )
+ def self.find_method_detail( klass, external_name, conditions = nil )
 
  method_details_mgr = get_method_details_mgr( klass )
-
- # md_mgr.all_available_operators.each { |l| puts "DEBUG: Mapped Method : #{l.inspect}" }
- substitutions(external_name).each do |n|
-
+
+ # first try for an exact match across all association types
+ MethodDetail::supported_types_enum.each do |t|
+ method_detail = method_details_mgr.find(external_name, t)
+ return method_detail.clone if(method_detail)
+ end
+
+ # Now try various alternatives of the name
+ substitutions(external_name).each do |n|
  # Try each association type, returning first that contains matching operator with name n
  MethodDetail::supported_types_enum.each do |t|
  method_detail = method_details_mgr.find(n, t)
@@ -165,14 +170,25 @@ module DataShift
 
  method_details_mgr = get_method_details_mgr( klass )
 
- substitutions(external_name).each do |n|
- method_detail = method_details_mgr.find(n, :assignment)
- return method_detail if(method_detail && method_detail.col_type)
+ # first try for an exact match across all association types
+ MethodDetail::supported_types_enum.each do |t|
+ method_detail = method_details_mgr.find(external_name, t)
+ return method_detail.clone if(method_detail && method_detail.col_type)
+ end
+
+ # Now try various alternatives
+ substitutions(external_name).each do |n|
+ # Try each association type, returning first that contains matching operator with name n
+ MethodDetail::supported_types_enum.each do |t|
+ method_detail = method_details_mgr.find(n, t)
+ return method_detail.clone if(method_detail && method_detail.col_type)
+ end
  end
-
+
  nil
  end
 
+
  def self.clear
  belongs_to.clear
  has_many.clear
@@ -190,7 +206,8 @@ module DataShift
  method_details_mgrs[klass] || MethodDetailsManager.new( klass )
  end
 
-
+
+ # Store a Mgr per mapped klass
  def self.method_details_mgrs
  @method_details_mgrs ||= {}
  @method_details_mgrs