data_transit 0.2.0
- data/Database.yml +27 -0
- data/LICENSE +22 -0
- data/README +150 -0
- data/Rakefile +47 -0
- data/bin/data_transit +40 -0
- data/bin/data_transit.bat +6 -0
- data/lib/datatransit/clean_find_in_batches.rb +34 -0
- data/lib/datatransit/cli.rb +66 -0
- data/lib/datatransit/database.rb +21 -0
- data/lib/datatransit/helper.rb +26 -0
- data/lib/datatransit/model/tables_source.rb +18 -0
- data/lib/datatransit/model/tables_target.rb +18 -0
- data/lib/datatransit/model_dumper.rb +65 -0
- data/lib/datatransit/rule_dsl.rb +235 -0
- data/lib/datatransit/tasks.rb +71 -0
- metadata +66 -0
data/Database.yml
ADDED
@@ -0,0 +1,27 @@
source:
  adapter: oracle_enhanced
  host: 10.162.211.9
  port: 1521
  database: ztjc2
  username: mw_ztjc
  password: ztjc

target:
  adapter: oracle_enhanced
  host: 10.162.106.153
  port: 1521
  database: orcl
  username: mw_app
  password: mw_app

test_target:
  adapter: sqlite3
  database: development.sqlite3
  pool: 5
  timeout: 5000

production:
  adapter: oracle
  database: comics_catalog_production
  username: comics_catalog
  password:
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
The MIT License (MIT)

Copyright (c) 2015 tianyuan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
data/README
ADDED
@@ -0,0 +1,150 @@
data_transit
data_transit is a ruby gem/app used to migrate between different relational
databases, supporting a customized migration procedure.


#1 Introduction
data_transit relies on activerecord to generate database models on the fly.
It is executed within a database transaction, so should any error occur during
the data transit, the transaction rolls back. Don't worry about introducing
dirty data into your target database.


#2 Install
data_transit can be installed using gem:
    gem install data_transit
or download data_transit.gem and install it locally:
    gem install --local /where/you/put/data_transit.gem

On your command line, run data_transit. You can proceed to "How to use" if you
see a message prompt like the one below:
    data_transit
    usage: data_transit command args. 4 commands are listed below...


#3 How to use

#3.1 Configure the database connection
    data_transit setup_db_conn /your/yaml/database/config/file

Your db config file should be compatible with the activerecord adapter you are
using and configured properly. As the names suggest, source designates the
database you plan to copy data from, while target is the copy destination.
Note that the keys 'source' and 'target' must be kept unchanged!
For example, here is a sample file for an Oracle DBMS.

database.yml
    source: #don't change this line
      adapter: oracle_enhanced
      database: dbserver
      username: xxx
      password: secret

    target: #don't change this line
      adapter: oracle_enhanced
      database: orcl
      username: copy1
      password: cipher


#3.2 Create the database schema (optional)
If your target database schema is already created, move on to 3.3 "Copy data";
otherwise use your database-specific tools to generate an identical schema in
your target database. If you don't have a handy tool to generate the schema,
for instance when you need to migrate between different database systems, you
can use data_transit to dump a schema description file based on your source
database schema, and then use this file to create your target schema.

Note this gem is built on activerecord, so it works best with database schemas
that follow rails conventions: for example, a single-column primary key rather
than a compound primary key (a primary key with more than one column), and a
primary key of integer type instead of other types like guid, timestamp etc.

In data_transit, I coded against the situation where a non-integer primary key
is used, at the cost of a minor problem: the batch-query feature provided by
activerecord cannot be used, because it depends on an integer primary key.
In this special case, a careless selection of the copy range might overburden
the network and the database servers, because all data in the specified range
are transmitted from the source database and then inserted into the target
database.


#3.2.1 Dump the source database schema
    data_transit dump_schema [schema_file] [rule_file]
[schema_file] will contain the dumped schema; [rule_file] describes how you
want data_transit to copy your data.

Note if your source schema is somewhat legacy, you might need some manual work
to adjust the generated schema_file to better meet your needs.

For example, in my test the source schema uses obj_id as the id column and
guid as the primary key type, so I had to keep activerecord from
auto-generating an "id" column by removing :primary_key => "obj_id" and adding
:id => false, and then appending a primary key constraint at the end of each
table definition. See below.

Here is an example dumped schema file:

    ActiveRecord::Schema.define(:version => 0) do
      create_table "table_one", :primary_key => "obj_id", :force => true do |t|
        #other fields
      end
      #other tables
    end

and I manually changed the schema definition to:

    ActiveRecord::Schema.define(:version => 0) do
      create_table "table_one", :id => false, :force => true do |t|
        t.string "obj_id", :limit => 42
        #other fields
      end
      execute "alter table table_one add primary key(obj_id)"
    end

#3.2.2 Create the target database schema
    data_transit create_table [schema_file]
If everything goes well, you will see a bunch of ddl execution history.


#3.3 Copy data
    data_transit copy_data [rule_file]
[rule_file] contains your copy logic. For security reasons, I changed the
table names; it looks as follows.


    start_date = "2015-01-01 00:00:00"
    end_date = "2015-02-01 00:00:00"

    migrate do
      choose_table "APP.TABLE1","APP.TABLE2","APP.TABLE3","APP.TABLE4","APP.TABLE5","APP.TABLE6"
      batch_by "ACQUISITION_TIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
      register_primary_key 'OBJ_ID'
    end

    migrate do
      choose_table "APP.TABLE7","APP.TABLE8","APP.TABLE9","APP.TABLE10","APP.TABLE11","APP.TABLE12"
      batch_by "ACKTIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
      register_primary_key 'OBJ_ID'
    end

    migrate do
      choose_table "APP.TABLE13","APP.TABLE14","APP.TABLE15","APP.TABLE16","APP.TABLE17","APP.TABLE18"
      batch_by "1>0" #query all data because these tables don't have a reasonable range
      register_primary_key 'OBJ_ID'
      pre_work do |targetCls| targetCls.delete_all("1>0") end #delete all rows in the target
      #post_work do |targetCls| end
    end


Each migrate block contains a data_transit task.

"choose_table" lists the tables included in this task. These tables share some
nature in common and can be processed with the same rule.

"batch_by" is the query condition.

"register_primary_key" declares the primary key of the tables.

"pre_work" is a block executed before each table is processed.

"post_work" is a block executed after each table is processed.
data/Rakefile
ADDED
@@ -0,0 +1,47 @@
#
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.


require 'rubygems'
require 'rake'
require 'rake/clean'
require 'rubygems/package_task'
require 'rdoc/task'
require 'rake/testtask'

spec = Gem::Specification.new do |s|
  s.name = 'data_transit'
  s.version = '0.2.0'
  s.has_rdoc = true
  s.extra_rdoc_files = ['README', 'LICENSE']
  s.summary = 'a ruby gem/app used to migrate between databases, supporting customized migration procedure'
  s.description = 'data_transit relies on activerecord to generate database Models on the fly. It is executed within a database transaction, and should any error occur during data transit, it will cause the transaction to rollback. So don\'t worry about introducing dirty data into your target database'
  s.author = 'thundercumt'
  s.email = 'thundercumt@126.com'
  s.homepage = 'https://github.com/thundercumt/data_transit'
  s.executables << "data_transit"
  s.files = %w(LICENSE README Rakefile Database.yml) + Dir.glob("{bin,lib,spec}/**/*")
  s.require_path = "lib"
  s.bindir = "bin"
end

Gem::PackageTask.new(spec) do |p|
  p.gem_spec = spec
  p.need_tar = true
  p.need_zip = true
end

Rake::RDocTask.new do |rdoc|
  files = ['README', 'LICENSE', 'lib/**/*.rb']
  rdoc.rdoc_files.add(files)
  rdoc.main = "README" # page to start on
  rdoc.title = "Data_Transit Docs"
  rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
  rdoc.options << '--line-numbers'
end

Rake::TestTask.new do |t|
  t.test_files = FileList['test/**/*.rb']
end
data/bin/data_transit
ADDED
@@ -0,0 +1,40 @@
#!/usr/bin/env ruby

require File::expand_path('../../lib/datatransit/cli', __FILE__)

command = ARGV[0]

help_msg = "usage: data_transit command args. 4 commands are listed below\n
1 data_transit setup_db_conn [conn.yml]
  conn.yml: the yml file describing db connection and adapters\n
2 data_transit dump_schema [schema_file] [rule_file]
  schema_file: the file to store source database schema
  rule_file: the user migration rule\n
3 data_transit create_table [schema_file]
  schema_file: the file generated by dump_schema\n
4 data_transit copy_data [rule_file]
  rule_file: the user migration rule"

case command
when "setup_db_conn"
  yml = ARGV[1]
  if File::exists?(yml)
    setup_db_conn yml
  else
    print "file #{yml} does not exist!\n"
  end

when "dump_schema"
  schema = ARGV[1]
  rule = ARGV[2]
  if not (schema =~ /\.rb$/)
    schema += ".rb"
  end
  dump_schema schema, rule
when "create_table"
  create_tables ARGV[1]
when "copy_data"
  copy_data ARGV[1]
else
  puts help_msg
end
data/lib/datatransit/clean_find_in_batches.rb
ADDED
@@ -0,0 +1,34 @@
module CleanFindInBatches

  def self.included(base)
    base.class_eval do
      alias :old_find_in_batches :find_in_batches
      alias :find_in_batches :replacement_find_in_batches
    end
  end

  # Override because the implementation of the regular find_in_batches
  # conflicts with UUID primary keys
  def replacement_find_in_batches(options = {}, &block)
    relation = self
    return old_find_in_batches(options, &block) if relation.primary_key.is_a?(Arel::Attributes::Integer)
    # Throw errors like the real thing
    if (finder_options = options.except(:batch_size)).present?
      raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
      raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
      raise 'You can\'t specify start, it\'s forced to be 0 because the ID is a string' if options.delete(:start)
      relation = apply_finder_options(finder_options)
    end
    # Compute the batch size
    batch_size = options.delete(:batch_size) || 1000
    offset = 0
    # Get the relation and keep going over it until there's nothing left
    relation = relation.except(:order).order(batch_order).limit(batch_size)
    while (results = relation.offset(offset).limit(batch_size).all).any?
      block.call results
      offset += batch_size
    end
    nil
  end

end
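
The offset loop above is the whole trick: the replacement pages through the
relation with OFFSET/LIMIT instead of walking an integer primary key. The same
loop shape can be tried in isolation with a plain Ruby array standing in for
the relation (a hypothetical helper, not part of the gem):

```ruby
# Offset-based batching, the same loop shape as replacement_find_in_batches,
# demonstrated with a plain array standing in for an ActiveRecord relation.
def each_batch(rows, batch_size: 3)
  offset = 0
  # Keep taking slices of batch_size from offset until a slice comes back empty.
  while (batch = rows[offset, batch_size]) && batch.any?
    yield batch
    offset += batch_size
  end
  nil
end

batches = []
each_batch((1..7).to_a) { |b| batches << b }
p batches  # => [[1, 2, 3], [4, 5, 6], [7]]
```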
data/lib/datatransit/cli.rb
ADDED
@@ -0,0 +1,66 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

#require 'rubygems'
#require 'bundler/setup'
#Bundler.require

#initial code
ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
# It is recommended to set the time zone in the TZ environment variable so that the same timezone will be used by Ruby and by the Oracle session
ENV['TZ'] = 'UTC'

require 'active_record'
require File::expand_path('../clean_find_in_batches', __FILE__)
ActiveRecord::Batches.send(:include, CleanFindInBatches)

require File::expand_path('../database', __FILE__)
require File::expand_path('../model/tables_source', __FILE__)
require File::expand_path('../model/tables_target', __FILE__)
require File::expand_path('../model_dumper', __FILE__)
require File::expand_path('../rule_dsl', __FILE__)

def dump_schema schema_file, rule_file
  puts "preparing to generate schema.rb, the schema of source database\n"
  ActiveRecord::Base.establish_connection(DataTransit::Database.source)
  #tables = DataTransit::Database.tables
  worker = DTWorker.new rule_file
  worker.load_work
  tables = worker.get_all_tables

  print tables, "\n"

  dumper = DataTransit::ModelDumper.new tables
  dumper.dump_tables File.open(schema_file, 'w')
  ActiveRecord::Base.remove_connection
  puts "schema.rb generated\n"
end

def create_tables schema_file
  puts "\npreparing to create tables in the target database\n"
  ActiveRecord::Base.establish_connection(DataTransit::Database.target)
  require schema_file
  ActiveRecord::Base.remove_connection
  puts "tables created in target database\n"
end


def copy_data rule_file
  worker = DTWorker.new rule_file
  worker.load_work
  worker.do_work
end

def setup_db_conn db_yml
  if File::exists? db_yml
    src = File::open(db_yml, "r")
    dst = File::open( File.expand_path('../../../database.yml', __FILE__), "w" )

    src.each_line { |line|
      dst.write line
    }
    src.close
    dst.close
  end
end
data/lib/datatransit/database.rb
ADDED
@@ -0,0 +1,21 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

module DataTransit

  class Database

    @@dbconfig = YAML::load( File.open( File.expand_path('../../../database.yml', __FILE__) ) )

    def self.source
      @@dbconfig['source']
    end

    def self.target
      @@dbconfig['target']
    end

  end

end
data/lib/datatransit/helper.rb
ADDED
@@ -0,0 +1,26 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

module TimeExt
  def fly_by_day days
    Time.at(self.to_i + days * 24 * 60 * 60)
  end

  def fly_by_week weeks
    Time.at(self.to_i + weeks * 7 * 24 * 60 * 60)
  end

  def fly_by_month months
    Time.at(self.to_i + months * 30 * 24 * 60 * 60)
  end

  def midnight
    # Time.at takes an epoch, not calendar components; build midnight of this day explicitly
    Time.local(self.year, self.month, self.day, 0, 0, 0)
  end

end

class Time
  include TimeExt
end
data/lib/datatransit/model/tables_source.rb
ADDED
@@ -0,0 +1,18 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

require 'active_record'
require File::expand_path('../../database', __FILE__)

module DataTransit

  module Source

    class SourceBase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection DataTransit::Database.source
    end
  end

end
data/lib/datatransit/model/tables_target.rb
ADDED
@@ -0,0 +1,18 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

require 'active_record'
require File::expand_path('../../database', __FILE__)

module DataTransit

  module Target

    class TargetBase < ActiveRecord::Base
      self.abstract_class = true
      establish_connection DataTransit::Database.target
    end
  end

end
data/lib/datatransit/model_dumper.rb
ADDED
@@ -0,0 +1,65 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

module DataTransit
  class GivenTableDumper < ActiveRecord::SchemaDumper

    def self.give_tables tables
      @@tables = tables
    end

    def dump(stream)
=begin
      if all_tables_exist?
        header(stream)
        tables(stream)
        trailer(stream)
      else
        print "No schema generated, because some table[s] do not exist!\n"
      end
=end
      header(stream)
      tables(stream)
      trailer(stream)

      stream
    end

    def tables(stream)
      @@tables.each do |tbl|
        table(tbl, stream)
      end
    end

    private
    def all_tables_exist?
      all_tables = @connection.tables
      tables_all_exist = true

      @@tables.each do |tbl|
        unless all_tables.include? tbl
          print "table [", tbl, "] doesn't exist!\n"
          tables_all_exist = false
        end
      end

      tables_all_exist
    end
  end


  class ModelDumper

    def initialize table_list
      @tables = table_list
    end

    def dump_tables (stream)
      GivenTableDumper.give_tables @tables
      GivenTableDumper.dump ActiveRecord::Base.connection, stream
    end
  end

end
data/lib/datatransit/rule_dsl.rb
ADDED
@@ -0,0 +1,235 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.

#chain_filter1 --> chain_filter2 --> chain_filter3

require 'singleton'
require 'progress_bar'

#require File::expand_path('../database', __FILE__)
require File::expand_path('../model/tables_source', __FILE__)
require File::expand_path('../model/tables_target', __FILE__)

module RULE_DSL

  class Task
    attr_accessor :tables, :search_cond, :pk, :pre_work, :post_work, :filter
  end

  class TaskSet
    def initialize
      @tasks = []
    end

    def push_back task
      @tasks << task
    end

    def pop_head
      @tasks.delete_at 0
    end

    def first_task
      @tasks[0]
    end

    def last_task
      @tasks[@tasks.length - 1]
    end

    def length
      @tasks.length
    end

    def task_at idx
      @tasks[idx]
    end

    def each
      for i in 0 .. @tasks.length-1
        yield @tasks[i]
      end
    end
  end

  def load_rule *rule_file
    rule_file.each{ |file| self.instance_eval File.read(file), file }
  end

  def migrate &script
    @taskset ||= TaskSet.new
    task = Task.new
    @taskset.push_back task
    yield
  end

  def choose_table *tables
    task = @taskset.last_task
    task.tables = tables
  end

  def batch_by search_cond
    task = @taskset.last_task
    task.search_cond = search_cond
  end

  def register_primary_key *key
    task = @taskset.last_task
    task.pk = key.map(&:downcase)
  end

  def filter_out_with &filter
    task = @taskset.last_task
    task.filter = filter
  end

  def pre_work &pre
    task = @taskset.last_task
    task.pre_work = pre
  end

  def post_work &post
    task = @taskset.last_task
    task.post_work = post
  end

  def get_all_tables
    tables = []
    @taskset.each { |task| tables << task.tables }
    tables.flatten
  end
end

class DTWorker
  include RULE_DSL
  attr_accessor :batch_size

  def initialize rule_file
    @rule_file = rule_file
    @batch_size = 500
  end

  def load_work
    load_rule @rule_file
  end

  def do_work
    #we copy all or nothing. a mess is not welcome here
    #on failures, the transaction will roll back automatically
    DataTransit::Target::TargetBase.transaction do
      @taskset.each do |task|
        do_task task
      end
    end
  end

  def do_task task
    pks = task.pk #the primary key names given in the rule file
    tables = task.tables

    tables.each do |tbl|
      sourceCls = Class.new(DataTransit::Source::SourceBase) do
        self.table_name = tbl
      end

      #columns = sourceCls.columns.map(&:name).map(&:downcase)
      columns = sourceCls.columns
      pk_column = get_pk_column(columns, pks)
      if pk_column != nil
        pk, pk_type = pk_column.name, pk_column.type
      else
        pk, pk_type = nil, nil
      end

      sourceCls.instance_eval( "self.primary_key = \"#{pk}\"" ) if pk != nil

      targetCls = Class.new(DataTransit::Target::TargetBase) do
        self.table_name = tbl
      end
      targetCls.instance_eval( "self.primary_key = \"#{pk}\"" ) if pk != nil

      print "\ntable ", tbl, ":\n"
      do_user_ar_proc targetCls, task.pre_work if task.pre_work
      do_batch_copy sourceCls, targetCls, task, pk, pk_type == "integer"
      do_user_ar_proc targetCls, task.post_work if task.post_work
    end
  end

  def do_user_ar_proc targetCls, proc
    proc.call targetCls
  end

  def do_batch_copy (sourceCls, targetCls, task, pk = nil, in_batch = false)
    count = sourceCls.where(task.search_cond).size.to_f
    return if count <= 0

    how_many_batch = (count / @batch_size).ceil
    #the progress bar
    bar = ProgressBar.new(count)

    if in_batch
      0.upto (how_many_batch-1) do |i|
        sourceCls.where(task.search_cond).find_each(
          start: i * @batch_size, batch_size: @batch_size) do |source_row|

          #update progress
          bar.increment!

          if task.filter
            next if do_filter_out task.filter, source_row
          end
          target_row = targetCls.new source_row.attributes

          #activerecord ignores the pk field, so the above initialization would leave a nil primary key.
          #copy the original pk into target_row explicitly; it is exactly what we need.
          if pk
            target_row.send( "#{pk}=", source_row.send("#{pk}") )
          end

          target_row.save
        end
      end
    else
      sourceCls.where(task.search_cond).each do |source_row|
        #update progress
        bar.increment!

        if task.filter
          next if do_filter_out task.filter, source_row
        end
        target_row = targetCls.new source_row.attributes

        #activerecord ignores the pk field, so the above initialization would leave a nil primary key.
        #copy the original pk into target_row explicitly; it is exactly what we need.
        if pk
          target_row.send( "#{pk}=", source_row.send("#{pk}") )
        end

        target_row.save
      end
    end

  end

  def do_filter_out filter, row
    if filter.call row
      return true
    end
    false
  end

  private
  def get_pk_column(columns, given_pk)
    column_names = columns.map(&:name).map(&:downcase)
    pk = column_names & given_pk
    if pk && pk.length > 0
      pk = pk[0]
      return columns[column_names.index(pk)]
    else
      return nil
    end
  end

end
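
One detail in do_batch_copy worth calling out: `targetCls.new source_row.attributes` leaves the primary key nil, so the code copies it back through a dynamic `send`-based setter, since the column name is only known at runtime. That pattern is plain Ruby and can be tried in isolation (the `Row` class below is a hypothetical stand-in, not part of the gem):

```ruby
# Dynamic getter/setter dispatch, as used in do_batch_copy to copy a
# primary key column whose name is only known at runtime.
class Row
  attr_accessor :obj_id, :name
end

src = Row.new
src.obj_id = "A-42"

pk = "obj_id"                       # column name discovered at runtime
dst = Row.new
dst.send("#{pk}=", src.send(pk))    # equivalent to dst.obj_id = src.obj_id
p dst.obj_id  # => "A-42"
```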
data/lib/datatransit/tasks.rb
ADDED
@@ -0,0 +1,71 @@
# To change this license header, choose License Headers in Project Properties.
# To change this template file, choose Tools | Templates
# and open the template in the editor.
require 'rubygems'
require 'bundler/setup'
Bundler.require

#initial code
ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
# It is recommended to set the time zone in the TZ environment variable so that the same timezone will be used by Ruby and by the Oracle session
ENV['TZ'] = 'UTC'


namespace :db do

  desc "environment verification"
  task :environment do
    require 'active_record'
    #require 'activerecord-oracle_enhanced-adapter'
    #require 'oci8'

    require File::expand_path('../database', __FILE__)
    require File::expand_path('../model/tables_source', __FILE__)
    require File::expand_path('../model/tables_target', __FILE__)
    require File::expand_path('../model_dumper', __FILE__)
  end

  desc "dsl related definitions"
  task :dsl => :environment do
    require File::expand_path('../rule_dsl', __FILE__)
  end

  desc "generate schema.rb, the schema of the source database"
  task :dump_schema => :dsl do
    puts "preparing to generate schema.rb, the schema of source database\n"
    ActiveRecord::Base.establish_connection(DataTransit::Database.source)
    #tables = DataTransit::Database.tables
    worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
    worker.load_work
    tables = worker.get_all_tables

    print tables, "\n"

    dumper = DataTransit::ModelDumper.new tables
    dumper.dump_tables File.open(File::expand_path('../schema.rb', __FILE__), 'w')
    ActiveRecord::Base.remove_connection
    puts "schema.rb generated\n"
  end

  desc "use schema.rb to create the schema in the target database"
  task :create_tables => :environment do
    puts "\npreparing to create tables in the target database\n"
    ActiveRecord::Base.establish_connection(DataTransit::Database.target)
    require File.expand_path("../schema", __FILE__)
    ActiveRecord::Base.remove_connection
    puts "tables created in target database\n"
  end

  desc "data transit, copy rows from source db tables to target db tables\n it supports incremental copy by additional arguments"
  task :copy_data, [] => :dsl do
    worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
    worker.load_work
    worker.do_work
  end

  #task :my, [:arg1, :arg2, :arg3, :arg4] => :environment do |t, args|
  #print t, args
  #tables = DataTransit::Database.tables
  #print tables
  #end

end
metadata
ADDED
@@ -0,0 +1,66 @@
--- !ruby/object:Gem::Specification
name: data_transit
version: !ruby/object:Gem::Version
  version: 0.2.0
  prerelease:
platform: ruby
authors:
- thundercumt
autorequire:
bindir: bin
cert_chain: []
date: 2015-03-12 00:00:00.000000000 Z
dependencies: []
description: data_transit relies on activerecord to generate database Models on the
  fly. It is executed within a database transaction, and should any error occur during
  data transit, it will cause the transaction to rollback. So don't worry about introducing
  dirty data into your target database
email: thundercumt@126.com
executables:
- data_transit
extensions: []
extra_rdoc_files:
- README
- LICENSE
files:
- LICENSE
- README
- Rakefile
- Database.yml
- bin/data_transit
- bin/data_transit.bat
- lib/datatransit/clean_find_in_batches.rb
- lib/datatransit/cli.rb
- lib/datatransit/database.rb
- lib/datatransit/helper.rb
- lib/datatransit/model/tables_source.rb
- lib/datatransit/model/tables_target.rb
- lib/datatransit/model_dumper.rb
- lib/datatransit/rule_dsl.rb
- lib/datatransit/tasks.rb
homepage: https://github.com/thundercumt/data_transit
licenses: []
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 1.8.28
signing_key:
specification_version: 3
summary: a ruby gem/app used to migrate between databases, supporting customized migration
  procedure
test_files: []