data_transit 0.2.0
- data/Database.yml +27 -0
- data/LICENSE +22 -0
- data/README +150 -0
- data/Rakefile +47 -0
- data/bin/data_transit +40 -0
- data/bin/data_transit.bat +6 -0
- data/lib/datatransit/clean_find_in_batches.rb +34 -0
- data/lib/datatransit/cli.rb +66 -0
- data/lib/datatransit/database.rb +21 -0
- data/lib/datatransit/helper.rb +26 -0
- data/lib/datatransit/model/tables_source.rb +18 -0
- data/lib/datatransit/model/tables_target.rb +18 -0
- data/lib/datatransit/model_dumper.rb +65 -0
- data/lib/datatransit/rule_dsl.rb +235 -0
- data/lib/datatransit/tasks.rb +71 -0
- metadata +66 -0
data/Database.yml
ADDED
@@ -0,0 +1,27 @@
source:
  adapter: oracle_enhanced
  host: 10.162.211.9
  port: 1521
  database: ztjc2
  username: mw_ztjc
  password: ztjc

target:
  adapter: oracle_enhanced
  host: 10.162.106.153
  port: 1521
  database: orcl
  username: mw_app
  password: mw_app

test_target:
  adapter: sqlite3
  database: development.sqlite3
  pool: 5
  timeout: 5000

production:
  adapter: oracle
  database: comics_catalog_production
  username: comics_catalog
  password:
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
The MIT License (MIT)

Copyright (c) 2015 tianyuan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
data/README
ADDED
@@ -0,0 +1,150 @@
data_transit
data_transit is a ruby gem/app used to migrate between different relational
databases, supporting a customized migration procedure.


#1 Introduction
data_transit relies on activerecord to generate database models on the fly.
It is executed within a database transaction, and should any error occur during
data transit, the transaction will roll back. So don't worry about
introducing dirty data into your target database.


#2 Install
data_transit can be installed with gem

    gem install data_transit

or download data_transit.gem and run

    gem install --local /where/you/put/data_transit.gem

In your command line, run data_transit. You can proceed to "how to use" if you
see a message prompt like the one below.

    data_transit
    usage: data_transit command args. 4 commands are listed below...


#3 How to use

#3.1 Configure the Database Connection

    data_transit setup_db_conn /your/yaml/database/config/file

Your db config file should be compatible with the activerecord adapter you are
using and configured properly. As the names suggest, source designates the
database you plan to copy data from, while target is the copy destination.
Note that the keys 'source' and 'target' must be kept unchanged!
For example, here is a sample file for an oracle dbms.

database.yml

    source: #don't change this line
      adapter: oracle_enhanced
      database: dbserver
      username: xxx
      password: secret

    target: #don't change this line
      adapter: oracle_enhanced
      database: orcl
      username: copy1
      password: cipher


#3.2 Create the Database Schema (optional)
If your target database schema is already created, move on to 3.3 "Copy Data";
otherwise use your database-specific tools to generate an identical schema in
your target database. If you don't have a handy tool to generate the schema,
for instance when you need to migrate between different database systems, you
can use data_transit to dump a schema description file based on your source
database schema, and then use this file to create your target schema.

Note this gem is built on activerecord, so it works best for database schemas
that follow rails conventions: for example, a single-column primary key rather
than a compound primary key (a primary key with more than one column), and a
primary key of integer type instead of other types like guid, timestamp, etc.

In data_transit, I coded against the situation where a non-integer primary key
is used. This has a minor consequence: the batch-query feature provided by
activerecord cannot be used, because it depends on an integer primary key.
In this special case, a careless selection of the copy range might overburden
the network and the database servers, because all data in the specified range
are transmitted from the source database and then inserted into the target
database.


#3.2.1 Dump the Source Database Schema

    data_transit dump_schema [schema_file] [rule_file]

[schema_file] will contain the dumped schema; [rule_file] describes how
you want data_transit to copy your data.

Note if your source schema is somewhat legacy, you might need some manual work
to adjust the generated schema_file to better meet your needs.

For example, in my test, the source schema uses obj_id as the id column, with
guid as the primary key type, so I needed to prevent activerecord from
auto-generating an "id" column by removing :primary_key => "obj_id", adding
:id => false, and then appending a primary key constraint at the end of each
table definition. See below.

Here is an example dumped schema file:

    ActiveRecord::Schema.define(:version => 0) do
      create_table "table_one", :primary_key => "obj_id", :force => true do |t|
        #other fields
      end
      #other tables
    end

and I manually changed the schema definition to

    ActiveRecord::Schema.define(:version => 0) do
      create_table "table_one", :id => false, :force => true do |t|
        t.string "obj_id", :limit => 42
        #other fields
      end
      execute "alter table table_one add primary key(obj_id)"
    end

#3.2.2 Create the Target Database Schema

    data_transit create_table [schema_file]

If everything goes well, you will see a bunch of ddl execution history.


#3.3 Copy Data

    data_transit copy_data [rule_file]

[rule_file] contains your copy logic. For security reasons, I changed the table
names; it looks as follows.


    start_date = "2015-01-01 00:00:00"
    end_date = "2015-02-01 00:00:00"

    migrate do
      choose_table "APP.TABLE1","APP.TABLE2","APP.TABLE3","APP.TABLE4","APP.TABLE5","APP.TABLE6"
      batch_by "ACQUISITION_TIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
      register_primary_key 'OBJ_ID'
    end

    migrate do
      choose_table "APP.TABLE7","APP.TABLE8","APP.TABLE9","APP.TABLE10","APP.TABLE11","APP.TABLE12"
      batch_by "ACKTIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
      register_primary_key 'OBJ_ID'
    end

    migrate do
      choose_table "APP.TABLE13","APP.TABLE14","APP.TABLE15","APP.TABLE16","APP.TABLE17","APP.TABLE18"
      batch_by "1>0" #query all data because these tables don't have a reasonable range
      register_primary_key 'OBJ_ID'
      pre_work do |targetCls| targetCls.delete_all("1>0") end #delete all rows in the target table
      #post_work do |targetCls| end
    end


Each migrate block contains a data_transit task.

"choose_table" describes which tables are included in this task. These tables
share some nature in common, and can be processed with the same rule.

"batch_by" is the query condition.

"register_primary_key" describes the primary key of the tables.

"pre_work" is a block executed before each table is processed.

"post_work" is a block executed after each table is processed.
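The rule DSL also provides a "filter_out_with" hook (defined in
lib/datatransit/rule_dsl.rb) that the sample above does not use: it takes a
block that receives each source row and returns true to skip copying that row.
A hedged sketch of a rule-file fragment; the table name and row attribute here
are invented for illustration:

    migrate do
      choose_table "APP.TABLE19"
      batch_by "1>0"
      register_primary_key 'OBJ_ID'
      filter_out_with do |row| row.obj_id.nil? end #skip rows without a primary key
    end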
data/Rakefile
ADDED
@@ -0,0 +1,47 @@
#
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.


require 'rubygems'
require 'rake'
require 'rake/clean'
require 'rubygems/package_task'
require 'rdoc/task'
require 'rake/testtask'

spec = Gem::Specification.new do |s|
  s.name = 'data_transit'
  s.version = '0.2.0'
  s.has_rdoc = true
  s.extra_rdoc_files = ['README', 'LICENSE']
  s.summary = 'a ruby gem/app used to migrate between databases, supporting customized migration procedure'
  s.description = 'data_transit relies on activerecord to generate database Models on the fly. It is executed within a database transaction, and should any error occur during data transit, it will cause the transaction to rollback. So don\'t worry about introducing dirty data into your target database'
  s.author = 'thundercumt'
  s.email = 'thundercumt@126.com'
  s.homepage = 'https://github.com/thundercumt/data_transit'
  s.executables << "data_transit"
  s.files = %w(LICENSE README Rakefile Database.yml) + Dir.glob("{bin,lib,spec}/**/*")
  s.require_path = "lib"
  s.bindir = "bin"
end

Gem::PackageTask.new(spec) do |p|
  p.gem_spec = spec
  p.need_tar = true
  p.need_zip = true
end

Rake::RDocTask.new do |rdoc|
  files = ['README', 'LICENSE', 'lib/**/*.rb']
  rdoc.rdoc_files.add(files)
  rdoc.main = "README" # page to start on
  rdoc.title = "Data_Transit Docs"
  rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
  rdoc.options << '--line-numbers'
end

Rake::TestTask.new do |t|
  t.test_files = FileList['test/**/*.rb']
end
data/bin/data_transit
ADDED
@@ -0,0 +1,40 @@
#!/usr/bin/env ruby

require File::expand_path('../../lib/datatransit/cli', __FILE__)

command = ARGV[0]

help_msg = "usage: data_transit command args. 4 commands are listed below\n
1 data_transit setup_db_conn [conn.yml]
  conn.yml: the yml file describing db connection and adapters\n
2 data_transit dump_schema [schema_file] [rule_file]
  schema_file: the file to store source database schema
  rule_file: the user migration rule\n
3 data_transit create_table [schema_file]
  schema_file: the file generated by dump_schema\n
4 data_transit copy_data [rule_file]
  rule_file: the user migration rule"

case command
when "setup_db_conn"
  yml = ARGV[1]
  if File::exists?(yml)
    setup_db_conn yml
  else
    print "file #{yml} does not exist!\n"
  end

when "dump_schema"
  schema = ARGV[1]
  rule = ARGV[2]
  if not (schema =~ /\.rb$/)
    schema += ".rb"
  end
  dump_schema schema, rule
when "create_table"
  create_tables ARGV[1]
when "copy_data"
  copy_data ARGV[1]
else
  puts help_msg
end
data/lib/datatransit/clean_find_in_batches.rb
ADDED
@@ -0,0 +1,34 @@
module CleanFindInBatches

  def self.included(base)
    base.class_eval do
      alias :old_find_in_batches :find_in_batches
      alias :find_in_batches :replacement_find_in_batches
    end
  end

  # Override because the regular find_in_batches implementation
  # conflicts with using UUIDs
  def replacement_find_in_batches(options = {}, &block)
    relation = self
    return old_find_in_batches(options, &block) if relation.primary_key.is_a?(Arel::Attributes::Integer)
    # Throw errors like the real thing
    if (finder_options = options.except(:batch_size)).present?
      raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
      raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
      raise 'You can\'t specify start, it\'s forced to be 0 because the ID is a string' if options.delete(:start)
      relation = apply_finder_options(finder_options)
    end
    # Compute the batch size
    batch_size = options.delete(:batch_size) || 1000
    offset = 0
    # Get the relation and keep going over it until there's nothing left
    relation = relation.except(:order).order(batch_order).limit(batch_size)
    while (results = relation.offset(offset).limit(batch_size).all).any?
      block.call results
      offset += batch_size
    end
    nil
  end

end
data/lib/datatransit/cli.rb
ADDED
@@ -0,0 +1,66 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

#require 'rubygems'
#require 'bundler/setup'
#Bundler.require

#initial code
ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
# It is recommended to set the time zone in the TZ environment variable so that the same timezone will be used by Ruby and by the Oracle session
ENV['TZ'] = 'UTC'

require 'active_record'
require File::expand_path('../clean_find_in_batches', __FILE__)
ActiveRecord::Batches.send(:include, CleanFindInBatches)

require File::expand_path('../database', __FILE__)
require File::expand_path('../model/tables_source', __FILE__)
require File::expand_path('../model/tables_target', __FILE__)
require File::expand_path('../model_dumper', __FILE__)
require File::expand_path('../rule_dsl', __FILE__)

def dump_schema schema_file, rule_file
  puts "preparing to generate schema.rb, the schema of source database\n"
  ActiveRecord::Base.establish_connection(DataTransit::Database.source)
  #tables = DataTransit::Database.tables
  worker = DTWorker.new rule_file
  worker.load_work
  tables = worker.get_all_tables

  print tables, "\n"

  dumper = DataTransit::ModelDumper.new tables
  dumper.dump_tables File.open(schema_file, 'w')
  ActiveRecord::Base.remove_connection
  puts "schema.rb generated\n"
end

def create_tables schema_file
  puts "\npreparing to create tables in the target database\n"
  ActiveRecord::Base.establish_connection(DataTransit::Database.target)
  require schema_file
  ActiveRecord::Base.remove_connection
  puts "tables created in target database\n"
end


def copy_data rule_file
  worker = DTWorker.new rule_file
  worker.load_work
  worker.do_work
end

def setup_db_conn db_yml
  if File::exists? db_yml
    src = File::open(db_yml, "r")
    dst = File::open( File.expand_path('../../../database.yml', __FILE__), "w" )

    src.each_line { |line|
      dst.write line
    }
    src.close
    dst.close
  end
end
data/lib/datatransit/database.rb
ADDED
@@ -0,0 +1,21 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

require 'yaml'

module DataTransit

  class Database

    @@dbconfig = YAML::load( File.open( File.expand_path('../../../database.yml', __FILE__) ) )

    def self.source
      @@dbconfig['source']
    end

    def self.target
      @@dbconfig['target']
    end

  end

end
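The class above simply reads the 'source' and 'target' sections out of
database.yml. A minimal, self-contained sketch of the config shape it expects,
parsed from an inline string instead of the gem's database.yml (the connection
values are invented):

```ruby
require 'yaml'

# The same two-section layout that Database.source / Database.target return.
config = YAML.load(<<YML)
source:
  adapter: oracle_enhanced
  database: dbserver
  username: xxx
  password: secret
target:
  adapter: sqlite3
  database: development.sqlite3
YML

source = config['source']  # hash handed to establish_connection for reads
target = config['target']  # hash handed to establish_connection for writes
```

Each section is a plain hash, which is exactly what
ActiveRecord::Base.establish_connection accepts.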
data/lib/datatransit/helper.rb
ADDED
@@ -0,0 +1,26 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

module TimeExt
  def fly_by_day days
    Time.at(self.to_i + days * 24 * 60 * 60)
  end

  def fly_by_week weeks
    Time.at(self.to_i + weeks * 7 * 24 * 60 * 60)
  end

  def fly_by_month months
    Time.at(self.to_i + months * 30 * 24 * 60 * 60)
  end

  def midnight
    # Time.at does not accept date components; build the next midnight explicitly
    Time.local(self.year, self.month, self.day) + 24 * 60 * 60
  end

end

class Time
  include TimeExt
end
data/lib/datatransit/model/tables_source.rb
ADDED
@@ -0,0 +1,18 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

require 'active_record'
require File::expand_path('../../database', __FILE__)

module DataTransit

  module Source

    class SourceBase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection DataTransit::Database.source
    end
  end

end
data/lib/datatransit/model/tables_target.rb
ADDED
@@ -0,0 +1,18 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

require 'active_record'
require File::expand_path('../../database', __FILE__)

module DataTransit

  module Target

    class TargetBase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection DataTransit::Database.target
    end
  end

end
data/lib/datatransit/model_dumper.rb
ADDED
@@ -0,0 +1,65 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

module DataTransit
  class GivenTableDumper < ActiveRecord::SchemaDumper

    def self.give_tables tables
      @@tables = tables
    end

    def dump(stream)
=begin
      if all_tables_exist?
        header(stream)
        tables(stream)
        trailer(stream)
      else
        print "No schema generated, because some table[s] do not exist!\n"
      end
=end
      header(stream)
      tables(stream)
      trailer(stream)

      stream
    end

    def tables(stream)
      @@tables.each do |tbl|
        table(tbl, stream)
      end
    end

    private
    def all_tables_exist?
      all_tables = @connection.tables
      tables_all_exist = true

      @@tables.each do |tbl|
        unless all_tables.include? tbl
          print "table [", tbl, "] doesn't exist!\n"
          tables_all_exist = false
        end
      end

      tables_all_exist
    end
  end


  class ModelDumper

    def initialize table_list
      @tables = table_list
    end

    def dump_tables (stream)
      GivenTableDumper.give_tables @tables
      GivenTableDumper.dump ActiveRecord::Base.connection, stream
    end
  end

end
data/lib/datatransit/rule_dsl.rb
ADDED
@@ -0,0 +1,235 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

#chain_filter1 --> chain_filter2 --> chain_filter3

require 'singleton'
require 'progress_bar'

#require File::expand_path('../database', __FILE__)
require File::expand_path('../model/tables_source', __FILE__)
require File::expand_path('../model/tables_target', __FILE__)

module RULE_DSL

  class Task
    attr_accessor :tables, :search_cond, :pk, :pre_work, :post_work, :filter
  end

  class TaskSet
    def initialize
      @tasks = []
    end

    def push_back task
      @tasks << task
    end

    def pop_head
      @tasks.delete_at 0
    end

    def first_task
      @tasks[0]
    end

    def last_task
      @tasks[@tasks.length - 1]
    end

    def length
      @tasks.length
    end

    def task_at idx
      @tasks[idx]
    end

    def each
      for i in 0 .. @tasks.length-1
        yield @tasks[i]
      end
    end
  end

  def load_rule *rule_file
    rule_file.each{ |file| self.instance_eval File.read(file), file }
  end

  def migrate &script
    @taskset ||= TaskSet.new
    task = Task.new
    @taskset.push_back task
    yield
  end

  def choose_table *tables
    task = @taskset.last_task
    task.tables = tables
  end

  def batch_by search_cond
    task = @taskset.last_task
    task.search_cond = search_cond
  end

  def register_primary_key *key
    task = @taskset.last_task
    task.pk = key.map(&:downcase)
  end

  def filter_out_with &filter
    task = @taskset.last_task
    task.filter = filter
  end

  def pre_work &pre
    task = @taskset.last_task
    task.pre_work = pre
  end

  def post_work &post
    task = @taskset.last_task
    task.post_work = post
  end

  def get_all_tables
    tables = []
    @taskset.each { |task| tables << task.tables }
    tables.flatten!
  end
end

class DTWorker
  include RULE_DSL
  attr_accessor :batch_size

  def initialize rule_file
    @rule_file = rule_file
    @batch_size = 500
  end

  def load_work
    load_rule @rule_file
  end

  def do_work
    #we copy all or nothing. a mess is not welcome here
    #on failures, the transaction will roll back automatically
    DataTransit::Target::TargetBase.transaction do
      @taskset.each do |task|
        do_task task
      end
    end
  end

  def do_task task
    pks = task.pk #the registered primary key names for this task
    tables = task.tables

    tables.each do |tbl|
      sourceCls = Class.new(DataTransit::Source::SourceBase) do
        self.table_name = tbl
      end

      #columns = sourceCls.columns.map(&:name).map(&:downcase)
      columns = sourceCls.columns
      pk_column = get_pk_column(columns, pks)
      if pk_column != nil
        pk, pk_type = pk_column.name, pk_column.type
      else
        pk, pk_type = nil, nil
      end

      sourceCls.instance_eval( "self.primary_key = \"#{pk}\"") if pk != nil

      targetCls = Class.new(DataTransit::Target::TargetBase) do
        self.table_name = tbl
      end
      targetCls.instance_eval( "self.primary_key = \"#{pk}\"") if pk != nil

      print "\ntable ", tbl, ":\n"
      do_user_ar_proc targetCls, task.pre_work if task.pre_work
      do_batch_copy sourceCls, targetCls, task, pk, pk_type == "integer"
      do_user_ar_proc targetCls, task.post_work if task.post_work
    end
  end

  def do_user_ar_proc targetCls, proc
    proc.call targetCls
  end

  def do_batch_copy (sourceCls, targetCls, task, pk = nil, in_batch = false)
    count = sourceCls.where(task.search_cond).size.to_f
    return if count <= 0

    how_many_batch = (count / @batch_size).ceil
    #the progress bar
    bar = ProgressBar.new(count)

    if in_batch
      0.upto (how_many_batch-1) do |i|
        sourceCls.where(task.search_cond).find_each(
          start: i * @batch_size, batch_size: @batch_size) do |source_row|

          #update progress
          bar.increment!

          if task.filter
            next if do_filter_out task.filter, source_row
          end
          target_row = targetCls.new source_row.attributes

          #activerecord ignores the pk field, so the initialization above would leave a nil primary key.
          #here the original pk is copied into target_row, which is exactly what we need.
          if pk
            target_row.send( "#{pk}=", source_row.send("#{pk}") )
          end

          target_row.save
        end
      end
    else
      sourceCls.where(task.search_cond).each do |source_row|
        #update progress
        bar.increment!

        if task.filter
          next if do_filter_out task.filter, source_row
        end
        target_row = targetCls.new source_row.attributes

        #activerecord ignores the pk field, so the initialization above would leave a nil primary key.
        #here the original pk is copied into target_row, which is exactly what we need.
        if pk
          target_row.send( "#{pk}=", source_row.send("#{pk}") )
        end

        target_row.save
      end
    end

  end

  def do_filter_out filter, row
    if filter.call row
      return true
    end
    false
  end

  private
  def get_pk_column(columns, given_pk)
    column_names = columns.map(&:name).map(&:downcase)
    pk = column_names & given_pk
    if pk && pk.length > 0
      pk = pk[0]
      return columns[column_names.index(pk)]
    else
      return nil
    end
  end

end
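The DSL mechanism above rests on load_rule: the rule file's top-level calls
(migrate, choose_table, ...) are run with instance_eval, so they dispatch to
the DSL methods mixed into the worker. A minimal stand-in demonstrating the
same pattern; MiniWorker and its single DSL method are invented for
illustration and are not the gem's classes:

```ruby
# A toy DSL host: rule text is instance_eval'd, so bare method calls in the
# rule resolve against this object, exactly as RULE_DSL#load_rule does.
class MiniWorker
  attr_reader :tables

  def initialize
    @tables = []
  end

  # DSL verb available inside rule text
  def choose_table *names
    @tables.concat names
  end

  # evaluate rule text in this object's context
  def load_rule text
    instance_eval text
  end
end

w = MiniWorker.new
w.load_rule %q{choose_table "APP.TABLE1", "APP.TABLE2"}
w.tables  # => ["APP.TABLE1", "APP.TABLE2"]
```

Because the rule file is plain Ruby, users get variables and string
interpolation for free, as the README's start_date/end_date example shows.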
data/lib/datatransit/tasks.rb
ADDED
@@ -0,0 +1,71 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.
require 'rubygems'
require 'bundler/setup'
Bundler.require

#initial code
ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
# It is recommended to set the time zone in the TZ environment variable so that the same timezone will be used by Ruby and by the Oracle session
ENV['TZ'] = 'UTC'


namespace :db do

  desc "environment verification"
  task :environment do
    require 'active_record'
    #require 'activerecord-oracle_enhanced-adapter'
    #require 'oci8'

    require File::expand_path('../database', __FILE__)
    require File::expand_path('../model/tables_source', __FILE__)
    require File::expand_path('../model/tables_target', __FILE__)
    require File::expand_path('../model_dumper', __FILE__)
  end

  desc "dsl related definitions"
  task :dsl => :environment do
    require File::expand_path('../rule_dsl', __FILE__)
  end

  desc "generate schema.rb, the schema of source database"
  task :dump_schema => :dsl do
    puts "preparing to generate schema.rb, the schema of source database\n"
    ActiveRecord::Base.establish_connection(DataTransit::Database.source)
    #tables = DataTransit::Database.tables
    worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
    worker.load_work
    tables = worker.get_all_tables

    print tables, "\n"

    dumper = DataTransit::ModelDumper.new tables
    dumper.dump_tables File.open(File::expand_path('../schema.rb', __FILE__), 'w')
    ActiveRecord::Base.remove_connection
    puts "schema.rb generated\n"
  end

  desc "use schema.rb to create the schema in the target database"
  task :create_tables => :environment do
    puts "\npreparing to create tables in the target database\n"
    ActiveRecord::Base.establish_connection(DataTransit::Database.target)
    require File.expand_path("../schema", __FILE__)
    ActiveRecord::Base.remove_connection
    puts "tables created in target database\n"
  end

  desc "data transit, copy rows from source db tables to target db tables\n it supports incremental copy by additional arguments"
  task :copy_data, [] => :dsl do
    worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
    worker.load_work
    worker.do_work
  end

  #task :my, [:arg1, :arg2, :arg3, :arg4] => :environment do |t, args|
  #print t, args
  #tables = DataTransit::Database.tables
  #print tables
  #end

end
metadata
ADDED
@@ -0,0 +1,66 @@
--- !ruby/object:Gem::Specification
name: data_transit
version: !ruby/object:Gem::Version
  version: 0.2.0
prerelease:
platform: ruby
authors:
- thundercumt
autorequire:
bindir: bin
cert_chain: []
date: 2015-03-12 00:00:00.000000000 Z
dependencies: []
description: data_transit relies on activerecord to generate database Models on the
  fly. It is executed within a database transaction, and should any error occur during
  data transit, it will cause the transaction to rollback. So don't worry about introducing
  dirty data into your target database
email: thundercumt@126.com
executables:
- data_transit
extensions: []
extra_rdoc_files:
- README
- LICENSE
files:
- LICENSE
- README
- Rakefile
- Database.yml
- bin/data_transit
- bin/data_transit.bat
- lib/datatransit/clean_find_in_batches.rb
- lib/datatransit/cli.rb
- lib/datatransit/database.rb
- lib/datatransit/helper.rb
- lib/datatransit/model/tables_source.rb
- lib/datatransit/model/tables_target.rb
- lib/datatransit/model_dumper.rb
- lib/datatransit/rule_dsl.rb
- lib/datatransit/tasks.rb
homepage: https://github.com/thundercumt/data_transit
licenses: []
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 1.8.28
signing_key:
specification_version: 3
summary: a ruby gem/app used to migrate between databases, supporting customized migration
  procedure
test_files: []