RubyGems - crhym3-imexport - Versions diffs - 0.1.0 → 0.1.1 - Mend

crhym3-imexport 0.1.0 → 0.1.1

Files changed (4) hide show

data/README.rdoc CHANGED Viewed

@@ -1,4 +1,219 @@
 == Description
-TODO
+This library imports data from a text file (created by
+mysql -E -e "SELECT something FROM somewhere, somewhere_else ..." > data.dump
+only for the moment) into your rails app, meaning it only works with your
+ActiveRecord models.
+I had this little problem. There was an old web app written in java and a new
+rails app. Although both were using mysql, "old" and "new" DBs were running on
+separated servers and they had quite a different data structures. Still,
+I wanted to keep some pieces of data synchronized quite frequently, at least
+for a not-so-short transition period.
+I had a couple options to consider:
+* Simple "INSERT INTO new_DB (...) SELECT original_data FROM old_DB" or similar
+  to that. Cons: I couldn't do it in one sql expression as the data structures
+  were too different; I'd have to take care of attributes such as +updated_at+
+  and +created_at+ manually, let alone models validations.
+* Create models for old data structures in the new rails app and do something
+  like this in a rake task:
+     OldModel.all do |old|
+       NewModel.create :attr1 => old.attr1, :attr2 => old...
+     end
+  Cons: I didn't want to mess up the new rails app with a bunch of models
+  I'd never use except for synchronization.
+* Dump +original_data+ into a CSV format and the use +CSV+ or +FasterCSV+.
+  Actually this was the choice I opted for from the beginning. The problem
+  here was that I had a really messed up data sometimes: lots of copy&paste
+  from software like MS Word, etc. +FasterCSV+ was throwing Malformed exceptions
+  too often and +CSV+ sometimes wasn't able to recognize end of row / beginning
+  of a new row. It wasn't their fault, it was my data bad quality. So, I decided
+  to write this little gem.
+== How it's different from CSV and FasterCSV
+First off, this library isn't meant to replace either of them. It works with
+different text formats (not CSV) and doesn't do just file parsing.
+Consider this snippet created by
+mysql -E -e "SELECT title AS COLUMN_title, speaker AS COLUMN_speaker, abstract AS COLUMN_abstract FROM Seminars" > seminars.txt:
+  *************************** 7. row ***************************
+  COLUMN_title: Conditional XPath = Codd Complete XPath
+  COLUMN_speaker: John Smith
+  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
+  expansion of XPath 1.0 in which  every first order query over
+  XML document tree models is expressible?
+  We give two necessary and sufficient conditions on XPath like
+This library creates a new model object, recognizes each +COLUMN_attr+ and
+tries to set attribute of that object, like model.title = COLUMN_title,
+model.speaker = COLUMN_speaker and model.abstract = COLUMN_abstract.
+It then runs model validations (model.valid?) and does either model.save or
+model.update_attributes(attrs_hash).
+== Usage
+Say, you have a model called +Seminar+ with the following attributes:
+  create_table "seminars", :force => true do |t|
+    t.string   "title",
+    t.text     "abstract",
+    t.datetime "date",
+    t.text     "notes"
+    t.datetime "created_at"
+    t.datetime "updated_at"
+    t.boolean  "published"
+  end
+Consider a snippet of a DB text dump similar to the previous example.
+Let's just add few more columns:
+  *************************** 7. row ***************************
+  COLUMN_title: Conditional XPath = Codd Complete XPath
+  COLUMN_date_time: 2004-11-30T15:30:00
+  COLUMN_publish: 1
+  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
+  expansion of XPath 1.0 in which  every first order query over
+  XML document tree models is expressible?
+Define a rake task in your rails app and require +imexport+, e.g.
+  namespace :db do
+    namespace :import do
+      task :seminars => :environment do
+        require imexport
+      end
+    end
+  end
+Now, define columns-to-model-attributes:
+  COLUMNS_TO_MODEL_MAP = {
+    'date_time' => { :date => Proc.new do |datetime|
+                                # YYYY-MM-DDTHH:MM:SS
+                                DateTime.strptime(datetime, '%FT%T')
+                              end },
+    'publish' => { :published => Proc.new do |val|
+                                         val.to_i == 1
+                                 end }
+  }
+As you noticed we didn't define mapping for +title+ and +abstract+ as they
+are simple strings and don't need any special conversion. Plus, column names
+are the same as model attributes.
+Lastly, let's do the sync:
+  ImExport::import(ENV['FROM_FILE'], {
+                   :class_name        => 'Seminar',
+                   :find_by           => 'title',
+                   :db_columns_prefix => 'COLUMN_',
+                   :map               => COLUMNS_TO_MODEL_MAP})
+You would run the task in this way:
+  rake db:import:seminars FROM_FILE=/path/to/seminars.txt
+and your +seminars+ table is synchronized.
+So, the complete rake task would look like this:
+  namespace :db do
+    namespace :import do
+      task :seminars => :environment do
+        require imexport
+        COLUMNS_TO_MODEL_MAP = {
+          'date_time' => { :date => Proc.new do |datetime|
+                                      # YYYY-MM-DDTHH:MM:SS
+                                      DateTime.strptime(datetime, '%FT%T')
+                                    end },
+          'publish' => { :published => Proc.new do |val|
+                                               val.to_i == 1
+                                       end }
+        }
+        ImExport::import(ENV['FROM_FILE'], {
+                         :class_name        => 'Seminar',
+                         :find_by           => 'title',
+                         :db_columns_prefix => 'COLUMN_',
+                         :map               => COLUMNS_TO_MODEL_MAP})
+      end
+    end
+  end
+Also, you can pass a block to ImExport::import. In that case you'll have to
+call model.save or model.update_attributes(...) yourself:
+  ImExport::import(ENV['FROM_FILE'], {
+                   :class_name        => 'Seminar',
+                   :find_by           => 'title',
+                   :db_columns_prefix => 'COLUMN_',
+                   :map               => COLUMNS_TO_MODEL_MAP}) do |seminar|
+    # do something with seminar object here, e.g.
+    # seminar.save
+    puts "---> #{seminar.inspect}"
+  end
+=== Options for ImExport::import
++class_name+::
+  "String" or :symbol. ActiveRecord model defined in your rails app.
++find_by+::
+  "String" or :symbol.
+  This is how ImExport will recognize whether it should do model.save or
+  model.update_attributes(...). Considering previous example it would do
+  seminar.save if seminar.find_by_title(...) returns nil or
+  seminar.update_attributes(...) otherwise.
++db_columns_prefix+::
+  "String".
+  Column name prefix that should be skipped while looking for the corresponding
+  model attribute name. Againg, considering previous example, +COLUMN_title+
+  actually means +title+ attribute of +Seminar+ model.
++map+::
+  Hash.
+  Tells ImExport how to map column attributes with their corresponding model
+  attributes. Don't add +db_columns_prefix+ to the colum names here,
+  it is already cleaned up.
+  Also, you don't really have to define mapping for attributes that have the
+  same names as columns in the text file to be parsed, they will be recognized
+  and set automatically.
+  Every item in this Hash can be defined in one of the following ways:
+    'column_name' => :symbol
+  _Behavior_: model.symbol = value_of_column_name
+    'column_name' => { :symbol => Proc.new { |column_value| ... } }
+  _Behavior_: model.symbol = result_of_Proc_call where Proc's only argument is
+  the column value.
+    'column_name' => Proc.new { |column_value, model_object| ... }
+  _Behavior_: Proc called with two arguments, column value and object-to-be-saved itself.
+  This is the only case where your code should take of updating
+  model's attribute(s) since ImExport can't guess the attribute name.
+== How to install
+  sudo gem install crhym3-imexport
+== License
+Copyright (c) 2009 Alex Vagin, released under the MIT license.
+mailto:alex@digns.com

data/Rakefile CHANGED Viewed

@@ -2,7 +2,7 @@ require 'rubygems'
 require 'rake'
 require 'echoe'
-Echoe.new('imexport', '0.1.0') do |p|
+Echoe.new('imexport', '0.1.1') do |p|
   p.description     = "Simple import from a text file generated by mysql -E ..."
   p.url             = "http://github.com/crhym3/imexport"
   p.author          = "alex"

data/imexport.gemspec CHANGED Viewed

@@ -2,11 +2,11 @@
 Gem::Specification.new do |s|
   s.name = %q{imexport}
-  s.version = "0.1.0"
+  s.version = "0.1.1"
   s.required_rubygems_version = Gem::Requirement.new(">= 1.2") if s.respond_to? :required_rubygems_version=
   s.authors = ["alex"]
-  s.date = %q{2009-04-21}
+  s.date = %q{2009-04-22}
   s.description = %q{Simple import from a text file generated by mysql -E ...}
   s.email = %q{alex@digns.com}
   s.extra_rdoc_files = ["CHANGELOG", "lib/imexport.rb", "README.rdoc"]

metadata CHANGED Viewed

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: crhym3-imexport
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.1.1
 platform: ruby
 authors:
 - alex
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
-date: 2009-04-21 00:00:00 -07:00
+date: 2009-04-22 00:00:00 -07:00
 default_executable:
 dependencies: []