crhym3-imexport 0.1.0 → 0.1.1
- data/README.rdoc +216 -1
- data/Rakefile +1 -1
- data/imexport.gemspec +2 -2
- metadata +2 -2
data/README.rdoc
CHANGED
@@ -1,4 +1,219 @@
 == Description
 
-
+This library imports data from a text file (created by
+  mysql -E -e "SELECT something FROM somewhere, somewhere_else ..." > data.dump
+only for the moment) into your rails app, meaning it only works with your
+ActiveRecord models.
+
+I had this little problem. There was an old web app written in java and a new
+rails app. Although both were using mysql, "old" and "new" DBs were running on
+separate servers and they had quite different data structures. Still,
+I wanted to keep some pieces of data synchronized quite frequently, at least
+for a not-so-short transition period.
+
+I had a couple of options to consider:
+
+* Simple "INSERT INTO new_DB (...) SELECT original_data FROM old_DB" or similar
+  to that. Cons: I couldn't do it in one sql expression as the data structures
+  were too different; I'd have to take care of attributes such as +updated_at+
+  and +created_at+ manually, let alone model validations.
+
+* Create models for old data structures in the new rails app and do something
+  like this in a rake task:
+
+    OldModel.all.each do |old|
+      NewModel.create :attr1 => old.attr1, :attr2 => old...
+    end
+
+  Cons: I didn't want to mess up the new rails app with a bunch of models
+  I'd never use except for synchronization.
+
+* Dump +original_data+ into a CSV format and then use +CSV+ or +FasterCSV+.
+  Actually this was the choice I opted for from the beginning. The problem
+  here was that I sometimes had really messed up data: lots of copy&paste
+  from software like MS Word, etc. +FasterCSV+ was throwing Malformed exceptions
+  too often and +CSV+ sometimes wasn't able to recognize end of row / beginning
+  of a new row. It wasn't their fault, it was the bad quality of my data. So, I decided
+  to write this little gem.
+
+== How it's different from CSV and FasterCSV
+
+First off, this library isn't meant to replace either of them. It works with
+a different text format (not CSV) and doesn't do just file parsing.
+
+Consider this snippet created by
+  mysql -E -e "SELECT title AS COLUMN_title, speaker AS COLUMN_speaker, abstract AS COLUMN_abstract FROM Seminars" > seminars.txt:
+
+  *************************** 7. row ***************************
+  COLUMN_title: Conditional XPath = Codd Complete XPath
+  COLUMN_speaker: John Smith
+  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
+  expansion of XPath 1.0 in which every first order query over
+  XML document tree models is expressible?
+  We give two necessary and sufficient conditions on XPath like
+
+This library creates a new model object, recognizes each +COLUMN_attr+ and
+tries to set the corresponding attribute of that object, like model.title = COLUMN_title,
+model.speaker = COLUMN_speaker and model.abstract = COLUMN_abstract.
+
+It then runs model validations (model.valid?) and does either model.save or
+model.update_attributes(attrs_hash).
+
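To make the format concrete, here is a minimal sketch of how a +mysql -E+ dump could be split into rows and columns. It is not imexport's actual code: the +VerticalDump+ module, its regular expressions and the +SAMPLE_DUMP+ constant are hypothetical names made up for this illustration. The sketch assumes that any line not starting with a prefixed column name belongs to the previous value, which is one plausible reason for aliasing columns with a prefix such as +COLUMN_+.

```ruby
# Hypothetical helper, NOT part of imexport: turns `mysql -E` output into
# one Hash per row, keyed by column name with the prefix removed.
module VerticalDump
  ROW_SEPARATOR = /\A\*+ \d+\. row \*+\z/
  FIELD         = /\A\s*(\w+): ?(.*)\z/

  def self.parse(text, prefix = 'COLUMN_')
    rows = []
    current_key = nil
    text.each_line do |raw|
      line = raw.chomp
      if ROW_SEPARATOR =~ line
        rows << {}          # a separator line starts a new row
        current_key = nil
      elsif rows.empty?
        next                # ignore anything before the first separator
      elsif (m = FIELD.match(line)) && m[1].start_with?(prefix)
        current_key = m[1][prefix.length..-1]
        rows.last[current_key] = m[2].dup
      elsif current_key
        # Assumption: unprefixed lines continue a multi-line value.
        rows.last[current_key] << "\n" << line
      end
    end
    rows
  end
end

SAMPLE_DUMP = <<~DUMP
  *************************** 7. row ***************************
  COLUMN_title: Conditional XPath = Codd Complete XPath
  COLUMN_speaker: John Smith
  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
  expansion of XPath 1.0 in which every first order query over
  XML document tree models is expressible?
DUMP

VerticalDump.parse(SAMPLE_DUMP).each { |row| puts row.inspect }
```

Each resulting Hash could then be assigned to a model attribute by attribute, which is the step the README describes next.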
+== Usage
+
+Say, you have a model called +Seminar+ with the following attributes:
+
+  create_table "seminars", :force => true do |t|
+    t.string "title"
+    t.text "abstract"
+    t.datetime "date"
+    t.text "notes"
+    t.datetime "created_at"
+    t.datetime "updated_at"
+    t.boolean "published"
+  end
+
+Consider a snippet of a DB text dump similar to the previous example.
+Let's just add a few more columns:
+
+  *************************** 7. row ***************************
+  COLUMN_title: Conditional XPath = Codd Complete XPath
+  COLUMN_date_time: 2004-11-30T15:30:00
+  COLUMN_publish: 1
+  COLUMN_abstract: This paper positively solves the following problem: Is there a natural
+  expansion of XPath 1.0 in which every first order query over
+  XML document tree models is expressible?
+
+Define a rake task in your rails app and require +imexport+, e.g.
+
+  namespace :db do
+    namespace :import do
+      task :seminars => :environment do
+        require 'imexport'
+      end
+    end
+  end
+
+Now, define the columns-to-model-attributes map:
+
+  COLUMNS_TO_MODEL_MAP = {
+    'date_time' => { :date => Proc.new do |datetime|
+      # YYYY-MM-DDTHH:MM:SS
+      DateTime.strptime(datetime, '%FT%T')
+    end },
+    'publish' => { :published => Proc.new do |val|
+      val.to_i == 1
+    end }
+  }
+
+As you noticed, we didn't define a mapping for +title+ and +abstract+ as they
+are simple strings and don't need any special conversion. Plus, column names
+are the same as model attributes.
+
+Lastly, let's do the sync:
+
+  ImExport::import(ENV['FROM_FILE'], {
+    :class_name => 'Seminar',
+    :find_by => 'title',
+    :db_columns_prefix => 'COLUMN_',
+    :map => COLUMNS_TO_MODEL_MAP})
+
+You would run the task in this way:
+
+  rake db:import:seminars FROM_FILE=/path/to/seminars.txt
+
+and your +seminars+ table is synchronized.
+
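The two converters in the map can be tried on their own, without Rails or ImExport, before wiring them into a rake task. The sketch below only exercises the same +DateTime.strptime+ and +to_i+ calls against the sample values from the dump; the +SeminarConverters+ module and its constant names are made up for this illustration.

```ruby
require 'date'

# Hypothetical container for the two converter Procs used in the map above.
module SeminarConverters
  # '%F' is shorthand for '%Y-%m-%d' and '%T' for '%H:%M:%S'.
  TO_DATETIME = Proc.new { |raw| DateTime.strptime(raw, '%FT%T') }
  # The dump stores booleans as "1" / "0" strings.
  TO_BOOLEAN  = Proc.new { |raw| raw.to_i == 1 }
end

date = SeminarConverters::TO_DATETIME.call('2004-11-30T15:30:00')
puts date.strftime('%Y-%m-%d %H:%M')
puts SeminarConverters::TO_BOOLEAN.call('1')
puts SeminarConverters::TO_BOOLEAN.call('0')
```

Since +DateTime.strptime+ raises an exception on a value that doesn't match the format, it may be worth checking a few real rows this way first if the source data is as messy as described earlier.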
+So, the complete rake task would look like this:
+
+  namespace :db do
+    namespace :import do
+      task :seminars => :environment do
+        require 'imexport'
+
+        COLUMNS_TO_MODEL_MAP = {
+          'date_time' => { :date => Proc.new do |datetime|
+            # YYYY-MM-DDTHH:MM:SS
+            DateTime.strptime(datetime, '%FT%T')
+          end },
+          'publish' => { :published => Proc.new do |val|
+            val.to_i == 1
+          end }
+        }
+
+        ImExport::import(ENV['FROM_FILE'], {
+          :class_name => 'Seminar',
+          :find_by => 'title',
+          :db_columns_prefix => 'COLUMN_',
+          :map => COLUMNS_TO_MODEL_MAP})
+      end
+    end
+  end
+
+Also, you can pass a block to ImExport::import. In that case you'll have to
+call model.save or model.update_attributes(...) yourself:
+
+  ImExport::import(ENV['FROM_FILE'], {
+    :class_name => 'Seminar',
+    :find_by => 'title',
+    :db_columns_prefix => 'COLUMN_',
+    :map => COLUMNS_TO_MODEL_MAP}) do |seminar|
+
+    # do something with seminar object here, e.g.
+    # seminar.save
+    puts "---> #{seminar.inspect}"
+  end
+
+=== Options for ImExport::import
+
+
++class_name+::
+  "String" or :symbol. ActiveRecord model defined in your rails app.
+
++find_by+::
+  "String" or :symbol.
+  This is how ImExport will recognize whether it should do model.save or
+  model.update_attributes(...). Considering the previous example, it would do
+  seminar.save if Seminar.find_by_title(...) returns nil or
+  seminar.update_attributes(...) otherwise.
+
++db_columns_prefix+::
+  "String".
+  Column name prefix that should be skipped while looking for the corresponding
+  model attribute name. Again, considering the previous example, +COLUMN_title+
+  actually means the +title+ attribute of the +Seminar+ model.
+
++map+::
+  Hash.
+  Tells ImExport how to map columns to their corresponding model
+  attributes. Don't add +db_columns_prefix+ to the column names here,
+  the prefix has already been stripped.
+
+  Also, you don't really have to define mapping for attributes that have the
+  same names as columns in the text file to be parsed, they will be recognized
+  and set automatically.
+
+  Every item in this Hash can be defined in one of the following ways:
+
+    'column_name' => :symbol
+  _Behavior_: model.symbol = value_of_column_name
+
+    'column_name' => { :symbol => Proc.new { |column_value| ... } }
+  _Behavior_: model.symbol = result_of_Proc_call where Proc's only argument is
+  the column value.
+
+    'column_name' => Proc.new { |column_value, model_object| ... }
+  _Behavior_: Proc called with two arguments, column value and object-to-be-saved itself.
+  This is the only case where your code should take care of updating
+  model's attribute(s) since ImExport can't guess the attribute name.
+
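The three +map+ forms described above can be sketched as a small dispatcher. This is an illustrative re-implementation written for this note, not imexport's internals: +MapSketch+, +SeminarStub+, +SAMPLE_MAP+ and +SAMPLE_ROW+ are hypothetical names, and a plain +Struct+ stands in for the ActiveRecord model so the example runs without Rails.

```ruby
# Stand-in for the Seminar model (hypothetical; a real model is ActiveRecord).
SeminarStub = Struct.new(:title, :date, :published, :notes)

# Hypothetical dispatcher showing how each map entry form could be applied.
module MapSketch
  def self.apply(model, row, map = {})
    row.each do |column, value|
      rule = map[column]
      case rule
      when nil
        # No entry: the column name is assumed to equal the attribute name.
        model.send("#{column}=", value)
      when Symbol, String
        # 'column_name' => :symbol
        model.send("#{rule}=", value)
      when Hash
        # 'column_name' => { :symbol => Proc }
        rule.each { |attr, converter| model.send("#{attr}=", converter.call(value)) }
      when Proc
        # 'column_name' => Proc; the Proc updates the model itself.
        rule.call(value, model)
      end
    end
    model
  end
end

SAMPLE_MAP = {
  'date_time' => :date,
  'publish'   => { :published => Proc.new { |val| val.to_i == 1 } },
  'speaker'   => Proc.new { |val, model| model.notes = "Speaker: #{val}" }
}

SAMPLE_ROW = {
  'title'     => 'Conditional XPath = Codd Complete XPath',
  'date_time' => '2004-11-30T15:30:00',
  'publish'   => '1',
  'speaker'   => 'John Smith'
}

puts MapSketch.apply(SeminarStub.new, SAMPLE_ROW, SAMPLE_MAP).inspect
```

Note that the keys of +SAMPLE_MAP+ carry no +COLUMN_+ prefix, matching the rule stated for +map+.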
+== How to install
+
+  sudo gem install crhym3-imexport
+
+== License
+
+Copyright (c) 2009 Alex Vagin, released under the MIT license.
+
+mailto:alex@digns.com
 
data/Rakefile
CHANGED
@@ -2,7 +2,7 @@ require 'rubygems'
 require 'rake'
 require 'echoe'
 
-Echoe.new('imexport', '0.1.
+Echoe.new('imexport', '0.1.1') do |p|
   p.description = "Simple import from a text file generated by mysql -E ..."
   p.url = "http://github.com/crhym3/imexport"
   p.author = "alex"
data/imexport.gemspec
CHANGED
@@ -2,11 +2,11 @@
 
 Gem::Specification.new do |s|
   s.name = %q{imexport}
-  s.version = "0.1.
+  s.version = "0.1.1"
 
   s.required_rubygems_version = Gem::Requirement.new(">= 1.2") if s.respond_to? :required_rubygems_version=
   s.authors = ["alex"]
-  s.date = %q{2009-04-
+  s.date = %q{2009-04-22}
   s.description = %q{Simple import from a text file generated by mysql -E ...}
   s.email = %q{alex@digns.com}
   s.extra_rdoc_files = ["CHANGELOG", "lib/imexport.rb", "README.rdoc"]
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: crhym3-imexport
 version: !ruby/object:Gem::Version
-  version: 0.1.
+  version: 0.1.1
 platform: ruby
 authors:
 - alex
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2009-04-
+date: 2009-04-22 00:00:00 -07:00
 default_executable:
 dependencies: []
 