data_transit 0.2.0

data/Database.yml ADDED
@@ -0,0 +1,27 @@
+ source:
+   adapter: oracle_enhanced
+   host: 10.162.211.9
+   port: 1521
+   database: ztjc2
+   username: mw_ztjc
+   password: ztjc
+
+ target:
+   adapter: oracle_enhanced
+   host: 10.162.106.153
+   port: 1521
+   database: orcl
+   username: mw_app
+   password: mw_app
+
+ test_target:
+   adapter: sqlite3
+   database: development.sqlite3
+   pool: 5
+   timeout: 5000
+
+ production:
+   adapter: oracle
+   database: comics_catalog_production
+   username: comics_catalog
+   password:
data/LICENSE ADDED
@@ -0,0 +1,22 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2015 tianyuan
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
data/README ADDED
@@ -0,0 +1,150 @@
+ data_transit
+ data_transit is a ruby gem/app used to migrate data between different relational
+ databases, supporting a customized migration procedure.
+
+
+ #1 Introduction
+ data_transit relies on activerecord to generate database Models on the fly.
+ It is executed within a database transaction, and should any error occur during
+ data transit, it will cause the transaction to roll back. So don't worry about
+ introducing dirty data into your target database.
+
+
+ #2 Install
+ data_transit can be installed using gem
+ gem install data_transit
+ or download data_transit.gem and run
+ gem install --local /where/you/put/data_transit.gem
+
+ In your command line, run data_transit. You can proceed to "how to use" if you
+ see a message prompt like the one below.
+ data_transit
+ usage: data_transit command args. 4 commands are listed below...
+
+
+ #3 How to use
+
+ #3.1 Configure the Database Connection
+ data_transit setup_db_conn /your/yaml/database/config/file
+
+ Your db config file should be compatible with the activerecord adapter you are
+ using and configured properly. As the name suggests, source designates the
+ database you plan to copy data from, while target is the copy destination.
+ Note that the keys 'source' and 'target' must be kept unchanged!
+ For example, here is a sample file for an Oracle DBMS.
+
+ database.yml
+ source: #don't change this line
+   adapter: oracle_enhanced
+   database: dbserver
+   username: xxx
+   password: secret
+
+ target: #don't change this line
+   adapter: oracle_enhanced
+   database: orcl
+   username: copy1
+   password: cipher
+
+
+ #3.2 Create Database Schema (optional)
+ If your target database schema is already created, skip to 3.3 "Copy Data";
+ otherwise use your database-specific tools to generate an identical schema in
+ your target database. If you don't have a handy tool to generate the schema,
+ for instance when you need to migrate between different database systems, you
+ can use data_transit to dump a schema description file based on your source
+ database schema, and then use this file to create your target schema.
+
+ Note this gem is built on activerecord, so it works best for database schemas
+ that follow rails conventions: for example, a single-column primary key rather
+ than a compound primary key (a primary key with more than one column), and a
+ primary key of integer type instead of other types like guid, timestamp etc.
+
+ In data_transit, I coded for the situation where a non-integer primary key is
+ used, which results in a minor limitation: the batch-query feature provided by
+ activerecord cannot be used, because it depends on an integer primary key.
+ In this special case, careless selection of the copy range might overburden the
+ network and database server, because all data in the specified range are
+ transmitted from the source database and then inserted into the target database.
+
+
+ #3.2.1 Dump Source Database Schema
+ data_transit dump_schema [schema_file] [rule_file]
+ [schema_file] will be used to contain the dumped schema; [rule_file] describes
+ how you want data_transit to copy your data.
+
+ Note that if your source schema is somewhat legacy, you might need some manual
+ work to adjust the generated schema_file to better meet your needs.
+
+ For example, in my test, the source schema uses obj_id as the id and guid as the
+ primary key type, so I need to prevent activerecord from auto-generating an "id"
+ column by removing :primary_key => "obj_id" and adding :id => false, and then
+ appending a primary key constraint at the end of each table definition. See below.
+
+ Here is an example dumped schema file:
+
+ ActiveRecord::Schema.define(:version => 0) do
+   create_table "table_one", :primary_key => "obj_id", :force => true do |t|
+     #other fields
+   end
+   #other tables
+ end
+
+ and I manually changed the schema definition to
+
+ ActiveRecord::Schema.define(:version => 0) do
+   create_table "table_one", :id => false, :force => true do |t|
+     t.string "obj_id", :limit => 42
+     #other fields
+   end
+   execute "alter table table_one add primary key(obj_id)"
+ end
+
+ #3.2.2 Create Target Database Schema
+ data_transit create_table [schema_file]
+ If everything goes well, you will see a bunch of DDL execution history.
+
+
+ #3.3 Copy Data
+ data_transit copy_data [rule_file]
+ [rule_file] contains your copy logic. For security reasons, I changed the table
+ names; an example looks as follows.
+
+
+ start_date = "2015-01-01 00:00:00"
+ end_date = "2015-02-01 00:00:00"
+
+ migrate do
+   choose_table "APP.TABLE1","APP.TABLE2","APP.TABLE3","APP.TABLE4","APP.TABLE5","APP.TABLE6"
+   batch_by "ACQUISITION_TIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
+   register_primary_key 'OBJ_ID'
+ end
+
+ migrate do
+   choose_table "APP.TABLE7","APP.TABLE8","APP.TABLE9","APP.TABLE10","APP.TABLE11","APP.TABLE12"
+   batch_by "ACKTIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
+   register_primary_key 'OBJ_ID'
+ end
+
+ migrate do
+   choose_table "APP.TABLE13","APP.TABLE14","APP.TABLE15","APP.TABLE16","APP.TABLE17","APP.TABLE18"
+   batch_by "1>0" #query all data because these tables don't have a reasonable range
+   register_primary_key 'OBJ_ID'
+   pre_work do |targetCls| targetCls.delete_all("1>0") end #delete all rows in the target table
+   #post_work do |targetCls| end
+ end
+
+
+ Each migrate block contains a data_transit task.
+
+ "choose_table" describes which tables are included in this task. These tables
+ share some common nature and can be processed with the same rule.
+
+ "batch_by" is the query condition.
+
+ "register_primary_key" describes the primary key of the tables.
+
+ "pre_work" is a block executed before each table is processed.
+
+ "post_work" is a block executed after each table is processed.
data/Rakefile ADDED
@@ -0,0 +1,47 @@
+ #
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+
+ require 'rubygems'
+ require 'rake'
+ require 'rake/clean'
+ require 'rubygems/package_task'
+ require 'rdoc/task'
+ require 'rake/testtask'
+
+ spec = Gem::Specification.new do |s|
+   s.name = 'data_transit'
+   s.version = '0.2.0'
+   s.has_rdoc = true
+   s.extra_rdoc_files = ['README', 'LICENSE']
+   s.summary = 'a ruby gem/app used to migrate between databases, supporting customized migration procedure'
+   s.description = 'data_transit relies on activerecord to generate database Models on the fly. It is executed within a database transaction, and should any error occur during data transit, it will cause the transaction to roll back. So don\'t worry about introducing dirty data into your target database'
+   s.author = 'thundercumt'
+   s.email = 'thundercumt@126.com'
+   s.homepage = 'https://github.com/thundercumt/data_transit'
+   s.executables << "data_transit"
+   s.files = %w(LICENSE README Rakefile Database.yml) + Dir.glob("{bin,lib,spec}/**/*")
+   s.require_path = "lib"
+   s.bindir = "bin"
+ end
+
+ Gem::PackageTask.new(spec) do |p|
+   p.gem_spec = spec
+   p.need_tar = true
+   p.need_zip = true
+ end
+
+ Rake::RDocTask.new do |rdoc|
+   files = ['README', 'LICENSE', 'lib/**/*.rb']
+   rdoc.rdoc_files.add(files)
+   rdoc.main = "README" # page to start on
+   rdoc.title = "Data_Transit Docs"
+   rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
+   rdoc.options << '--line-numbers'
+ end
+
+ Rake::TestTask.new do |t|
+   t.test_files = FileList['test/**/*.rb']
+ end
data/bin/data_transit ADDED
@@ -0,0 +1,40 @@
+ #!/usr/bin/env ruby
+
+ require File::expand_path('../../lib/datatransit/cli', __FILE__)
+
+ command = ARGV[0]
+
+ help_msg = "usage: data_transit command args. 4 commands are listed below\n
+ 1 data_transit setup_db_conn [conn.yml]
+ conn.yml: the yml file describing db connection and adapters\n
+ 2 data_transit dump_schema [schema_file] [rule_file]
+ schema_file: the file to store source database schema
+ rule_file: the user migration rule\n
+ 3 data_transit create_table [schema_file]
+ schema_file: the file generated by dump_schema\n
+ 4 data_transit copy_data [rule_file]
+ rule_file: the user migration rule"
+
+ case command
+ when "setup_db_conn"
+   yml = ARGV[1]
+   if File::exists?(yml)
+     setup_db_conn yml
+   else
+     print "file #{yml} does not exist!\n"
+   end
+
+ when "dump_schema"
+   schema = ARGV[1]
+   rule = ARGV[2]
+   if not (schema =~ /\.rb$/)
+     schema += ".rb"
+   end
+   dump_schema schema, rule
+ when "create_table"
+   create_tables ARGV[1]
+ when "copy_data"
+   copy_data ARGV[1]
+ else
+   puts help_msg
+ end
data/bin/data_transit.bat ADDED
@@ -0,0 +1,6 @@
+ @ECHO OFF
+ IF NOT "%~f0" == "~f0" GOTO :WinNT
+ @"ruby.exe" "data_transit" %1 %2 %3 %4 %5 %6 %7 %8 %9
+ GOTO :EOF
+ :WinNT
+ @"ruby.exe" "data_transit" %*
data/lib/datatransit/clean_find_in_batches.rb ADDED
@@ -0,0 +1,34 @@
+ module CleanFindInBatches
+
+   def self.included(base)
+     base.class_eval do
+       alias :old_find_in_batches :find_in_batches
+       alias :find_in_batches :replacement_find_in_batches
+     end
+   end
+
+   # Override because the regular find_in_batches implementation
+   # conflicts with UUID (string) primary keys
+   def replacement_find_in_batches(options = {}, &block)
+     relation = self
+     return old_find_in_batches(options, &block) if relation.primary_key.is_a?(Arel::Attributes::Integer)
+     # Throw errors like the real thing
+     if (finder_options = options.except(:batch_size)).present?
+       raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
+       raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
+       raise 'You can\'t specify start, it\'s forced to be 0 because the ID is a string' if options.delete(:start)
+       relation = apply_finder_options(finder_options)
+     end
+     # Compute the batch size
+     batch_size = options.delete(:batch_size) || 1000
+     offset = 0
+     # Order by primary key and walk the relation with offset/limit
+     # until a batch comes back empty
+     relation = relation.except(:order).order(batch_order).limit(batch_size)
+     while (results = relation.offset(offset).limit(batch_size).all).any?
+       block.call results
+       offset += batch_size
+     end
+     nil
+   end
+
+ end
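The offset/limit walk that replacement_find_in_batches performs can be sketched standalone. Here a plain sorted Array stands in for the ActiveRecord relation, and string slicing stands in for `offset(offset).limit(batch_size)` — an illustrative sketch of the strategy, not the gem's API:

```ruby
# Toy model of the batching loop above: order by the string key, then
# repeatedly take `batch_size` rows at an increasing offset until a
# fetch comes back empty.
rows = (1..26).map { |i| format("obj-%02d", i) } # pretend UUID-keyed rows
batch_size = 10
offset = 0
batches = []
loop do
  # stands in for relation.offset(offset).limit(batch_size).all
  results = rows.sort[offset, batch_size] || []
  break if results.empty?
  batches << results
  offset += batch_size
end
```

This is why the module forbids a caller-supplied `:order` or `:start` — correctness depends on a stable ordering by the (string) primary key across every fetch.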
data/lib/datatransit/cli.rb ADDED
@@ -0,0 +1,66 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ #require 'rubygems'
+ #require 'bundler/setup'
+ #Bundler.require
+
+ #initial code
+ ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
+ # It is recommended to set the time zone in the TZ environment variable so that
+ # the same timezone is used by Ruby and by the Oracle session
+ ENV['TZ'] = 'UTC'
+
+ require 'active_record'
+ require File::expand_path('../clean_find_in_batches', __FILE__)
+ ActiveRecord::Batches.send(:include, CleanFindInBatches)
+
+ require File::expand_path('../database', __FILE__)
+ require File::expand_path('../model/tables_source', __FILE__)
+ require File::expand_path('../model/tables_target', __FILE__)
+ require File::expand_path('../model_dumper', __FILE__)
+ require File::expand_path('../rule_dsl', __FILE__)
+
+ def dump_schema schema_file, rule_file
+   puts "preparing to generate schema.rb, the schema of source database\n"
+   ActiveRecord::Base.establish_connection(DataTransit::Database.source)
+   worker = DTWorker.new rule_file
+   worker.load_work
+   tables = worker.get_all_tables
+
+   print tables, "\n"
+
+   dumper = DataTransit::ModelDumper.new tables
+   dumper.dump_tables File.open(schema_file, 'w')
+   ActiveRecord::Base.remove_connection
+   puts "schema.rb generated\n"
+ end
+
+ def create_tables schema_file
+   puts "\npreparing to create tables in the target database\n"
+   ActiveRecord::Base.establish_connection(DataTransit::Database.target)
+   require schema_file
+   ActiveRecord::Base.remove_connection
+   puts "tables created in target database\n"
+ end
+
+
+ def copy_data rule_file
+   worker = DTWorker.new rule_file
+   worker.load_work
+   worker.do_work
+ end
+
+ def setup_db_conn db_yml
+   if File::exists? db_yml
+     src = File::open(db_yml, "r")
+     dst = File::open( File.expand_path('../../../database.yml', __FILE__), "w" )
+
+     src.each_line { |line|
+       dst.write line
+     }
+     src.close
+     dst.close
+   end
+ end
data/lib/datatransit/database.rb ADDED
@@ -0,0 +1,21 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module DataTransit
+
+   class Database
+
+     @@dbconfig = YAML::load( File.open( File.expand_path('../../../database.yml', __FILE__) ) )
+
+     def self.source
+       @@dbconfig['source']
+     end
+
+     def self.target
+       @@dbconfig['target']
+     end
+
+   end
+
+ end
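Database just parses the YAML written by setup_db_conn and exposes its two top-level keys. A minimal standalone sketch of what it reads (the adapter/database values here are placeholders, not the gem's defaults):

```ruby
require 'yaml'

# Inline stand-in for database.yml; only the 'source' and 'target'
# top-level keys are consumed by DataTransit::Database.
yml = <<YML
source:
  adapter: sqlite3
  database: src.sqlite3
target:
  adapter: sqlite3
  database: dst.sqlite3
YML

config = YAML.load(yml)
source = config['source']   # the hash Database.source returns
target = config['target']   # the hash Database.target returns
```

Each hash is handed directly to `ActiveRecord::Base.establish_connection`, which is why the config must match the adapter you have installed.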
data/lib/datatransit/helper.rb ADDED
@@ -0,0 +1,26 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module TimeExt
+   def fly_by_day days
+     Time.at(self.to_i + days * 24 * 60 * 60)
+   end
+
+   def fly_by_week weeks
+     Time.at(self.to_i + weeks * 7 * 24 * 60 * 60)
+   end
+
+   def fly_by_month months
+     Time.at(self.to_i + months * 30 * 24 * 60 * 60)
+   end
+
+   def midnight
+     # hour 24 of this day, i.e. the start of the next day
+     # (Time.at does not accept year/month/day arguments)
+     Time.new(self.year, self.month, self.day) + 24 * 60 * 60
+   end
+
+ end
+
+ class Time
+   include TimeExt
+ end
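The fly_by_* helpers are plain epoch-second arithmetic (note a "month" is approximated as 30 days). A self-contained sketch, with the module reproduced inline so it runs without the gem:

```ruby
# Same arithmetic as lib/datatransit/helper.rb: jump forward by whole
# days / weeks / 30-day "months" in epoch seconds.
module TimeExt
  def fly_by_day(days)
    Time.at(to_i + days * 24 * 60 * 60)
  end

  def fly_by_week(weeks)
    Time.at(to_i + weeks * 7 * 24 * 60 * 60)
  end

  def fly_by_month(months)
    Time.at(to_i + months * 30 * 24 * 60 * 60)
  end
end

class Time
  include TimeExt
end

t  = Time.at(0)          # the epoch
t1 = t.fly_by_day(1)     # one day later
t7 = t.fly_by_week(1)    # one week later
```

Because the helpers work in raw seconds, they ignore DST and calendar-month lengths; that is fine for the coarse copy-range arithmetic a rule file needs.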
data/lib/datatransit/model/tables_source.rb ADDED
@@ -0,0 +1,18 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ require 'active_record'
+ require File::expand_path('../../database', __FILE__)
+
+ module DataTransit
+
+   module Source
+
+     class SourceBase < ActiveRecord::Base
+       self.abstract_class = true
+       establish_connection DataTransit::Database.source
+     end
+   end
+
+ end
data/lib/datatransit/model/tables_target.rb ADDED
@@ -0,0 +1,18 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ require 'active_record'
+ require File::expand_path('../../database', __FILE__)
+
+ module DataTransit
+
+   module Target
+
+     class TargetBase < ActiveRecord::Base
+       self.abstract_class = true
+       establish_connection DataTransit::Database.target
+     end
+   end
+
+ end
data/lib/datatransit/model_dumper.rb ADDED
@@ -0,0 +1,65 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module DataTransit
+   class GivenTableDumper < ActiveRecord::SchemaDumper
+
+     def self.give_tables tables
+       @@tables = tables
+     end
+
+     def dump(stream)
+       # disabled: only dump when every chosen table exists
+       #if all_tables_exist?
+       #  header(stream)
+       #  tables(stream)
+       #  trailer(stream)
+       #else
+       #  print "No schema generated, because some table[s] do not exist!\n"
+       #end
+       header(stream)
+       tables(stream)
+       trailer(stream)
+
+       stream
+     end
+
+     def tables(stream)
+       @@tables.each do |tbl|
+         table(tbl, stream)
+       end
+     end
+
+     private
+     def all_tables_exist?
+       all_tables = @connection.tables
+       tables_all_exist = true
+
+       @@tables.each do |tbl|
+         unless all_tables.include? tbl
+           print "table [", tbl, "] doesn't exist!\n"
+           tables_all_exist = false
+         end
+       end
+
+       tables_all_exist
+     end
+   end
+
+
+   class ModelDumper
+
+     def initialize table_list
+       @tables = table_list
+     end
+
+     def dump_tables (stream)
+       GivenTableDumper.give_tables @tables
+       GivenTableDumper.dump ActiveRecord::Base.connection, stream
+     end
+   end
+
+ end
data/lib/datatransit/rule_dsl.rb ADDED
@@ -0,0 +1,235 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ #chain_filter1 --> chain_filter2 --> chain_filter3
+
+ require 'singleton'
+ require 'progress_bar'
+
+ #require File::expand_path('../database', __FILE__)
+ require File::expand_path('../model/tables_source', __FILE__)
+ require File::expand_path('../model/tables_target', __FILE__)
+
+ module RULE_DSL
+
+   class Task
+     attr_accessor :tables, :search_cond, :pk, :pre_work, :post_work, :filter
+   end
+
+   class TaskSet
+     def initialize
+       @tasks = []
+     end
+
+     def push_back task
+       @tasks << task
+     end
+
+     def pop_head
+       @tasks.delete_at 0
+     end
+
+     def first_task
+       @tasks[0]
+     end
+
+     def last_task
+       @tasks[@tasks.length - 1]
+     end
+
+     def length
+       @tasks.length
+     end
+
+     def task_at idx
+       @tasks[idx]
+     end
+
+     def each
+       @tasks.each { |task| yield task }
+     end
+   end
+
+   def load_rule *rule_file
+     rule_file.each { |file| self.instance_eval File.read(file), file }
+   end
+
+   def migrate &script
+     @taskset ||= TaskSet.new
+     task = Task.new
+     @taskset.push_back task
+     yield
+   end
+
+   def choose_table *tables
+     task = @taskset.last_task
+     task.tables = tables
+   end
+
+   def batch_by search_cond
+     task = @taskset.last_task
+     task.search_cond = search_cond
+   end
+
+   def register_primary_key *key
+     task = @taskset.last_task
+     task.pk = key.map(&:downcase)
+   end
+
+   def filter_out_with &filter
+     task = @taskset.last_task
+     task.filter = filter
+   end
+
+   def pre_work &pre
+     task = @taskset.last_task
+     task.pre_work = pre
+   end
+
+   def post_work &post
+     task = @taskset.last_task
+     task.post_work = post
+   end
+
+   def get_all_tables
+     tables = []
+     @taskset.each { |task| tables << task.tables }
+     tables.flatten!
+   end
+ end
+
+ class DTWorker
+   include RULE_DSL
+   attr_accessor :batch_size
+
+   def initialize rule_file
+     @rule_file = rule_file
+     @batch_size = 500
+   end
+
+   def load_work
+     load_rule @rule_file
+   end
+
+   def do_work
+     #we copy all or nothing. a mess is not welcome here
+     #on failures, the transaction will roll back automatically
+     DataTransit::Target::TargetBase.transaction do
+       @taskset.each do |task|
+         do_task task
+       end
+     end
+   end
+
+   def do_task task
+     pks = task.pk #primary key candidates given by the rule file
+     tables = task.tables
+
+     tables.each do |tbl|
+       sourceCls = Class.new(DataTransit::Source::SourceBase) do
+         self.table_name = tbl
+       end
+
+       columns = sourceCls.columns
+       pk_column = get_pk_column(columns, pks)
+       if pk_column != nil
+         pk, pk_type = pk_column.name, pk_column.type
+       else
+         pk, pk_type = nil, nil
+       end
+
+       sourceCls.instance_eval("self.primary_key = \"#{pk}\"") if pk != nil
+
+       targetCls = Class.new(DataTransit::Target::TargetBase) do
+         self.table_name = tbl
+       end
+       targetCls.instance_eval("self.primary_key = \"#{pk}\"") if pk != nil
+
+       print "\ntable ", tbl, ":\n"
+       do_user_ar_proc targetCls, task.pre_work if task.pre_work
+       #column.type is a symbol, so compare against :integer, not "integer"
+       do_batch_copy sourceCls, targetCls, task, pk, pk_type == :integer
+       do_user_ar_proc targetCls, task.post_work if task.post_work
+     end
+   end
+
+   def do_user_ar_proc targetCls, proc
+     proc.call targetCls
+   end
+
+   def do_batch_copy (sourceCls, targetCls, task, pk = nil, in_batch = false)
+     count = sourceCls.where(task.search_cond).count
+     return if count <= 0
+
+     #the progress bar
+     bar = ProgressBar.new(count)
+
+     if in_batch
+       #find_each walks the whole result set itself, one batch at a time,
+       #so a single call suffices (no outer offset loop)
+       sourceCls.where(task.search_cond).find_each(batch_size: @batch_size) do |source_row|
+         #update progress
+         bar.increment!
+
+         if task.filter
+           next if do_filter_out task.filter, source_row
+         end
+         target_row = targetCls.new source_row.attributes
+
+         #activerecord ignores the pk attribute on initialization, which would
+         #leave a nil primary key; copy the original pk into target_row explicitly
+         if pk
+           target_row.send( "#{pk}=", source_row.send("#{pk}") )
+         end
+
+         target_row.save
+       end
+     else
+       sourceCls.where(task.search_cond).each do |source_row|
+         #update progress
+         bar.increment!
+
+         if task.filter
+           next if do_filter_out task.filter, source_row
+         end
+         target_row = targetCls.new source_row.attributes
+
+         #activerecord ignores the pk attribute on initialization, which would
+         #leave a nil primary key; copy the original pk into target_row explicitly
+         if pk
+           target_row.send( "#{pk}=", source_row.send("#{pk}") )
+         end
+
+         target_row.save
+       end
+     end
+   end
+
+   def do_filter_out filter, row
+     filter.call(row) ? true : false
+   end
+
+   private
+   def get_pk_column(columns, given_pk)
+     column_names = columns.map(&:name).map(&:downcase)
+     pk = column_names & given_pk
+     if pk && pk.length > 0
+       return columns[column_names.index(pk[0])]
+     else
+       return nil
+     end
+   end
+
+ end
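How the DSL verbs cooperate can be seen with a trimmed, standalone re-implementation: `migrate` appends a task and every other verb configures the most recently appended one. This is an illustrative sketch only; the real RULE_DSL also wires filter_out_with, pre_work/post_work, and the batch copy itself:

```ruby
# Trimmed model of RULE_DSL: each `migrate` block pushes a Task,
# and the DSL verbs mutate the last pushed one.
Task = Struct.new(:tables, :search_cond, :pk)

@tasks = []

def migrate
  @tasks << Task.new
  yield
end

def choose_table(*tables)
  @tasks.last.tables = tables
end

def batch_by(cond)
  @tasks.last.search_cond = cond
end

def register_primary_key(*keys)
  # downcased, like the real DSL, to match AR column names
  @tasks.last.pk = keys.map(&:downcase)
end

# A rule "file" in the README's style, evaluated directly:
migrate do
  choose_table "APP.TABLE1", "APP.TABLE2"
  batch_by "1>0"
  register_primary_key "OBJ_ID"
end
```

In the gem the rule file is loaded with `instance_eval`, which is why a plain .rb file of bare `migrate` blocks works as configuration.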
data/lib/datatransit/tasks.rb ADDED
@@ -0,0 +1,71 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+ require 'rubygems'
+ require 'bundler/setup'
+ Bundler.require
+
+ #initial code
+ ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
+ # It is recommended to set the time zone in the TZ environment variable so that
+ # the same timezone is used by Ruby and by the Oracle session
+ ENV['TZ'] = 'UTC'
+
+
+ namespace :db do
+
+   desc "environment verification"
+   task :environment do
+     require 'active_record'
+     #require 'activerecord-oracle_enhanced-adapter'
+     #require 'oci8'
+
+     require File::expand_path('../database', __FILE__)
+     require File::expand_path('../model/tables_source', __FILE__)
+     require File::expand_path('../model/tables_target', __FILE__)
+     require File::expand_path('../model_dumper', __FILE__)
+   end
+
+   desc "dsl related definitions"
+   task :dsl => :environment do
+     require File::expand_path('../rule_dsl', __FILE__)
+   end
+
+   desc "generate schema.rb, the schema of the source database"
+   task :dump_schema => :dsl do
+     puts "preparing to generate schema.rb, the schema of source database\n"
+     ActiveRecord::Base.establish_connection(DataTransit::Database.source)
+     worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
+     worker.load_work
+     tables = worker.get_all_tables
+
+     print tables, "\n"
+
+     dumper = DataTransit::ModelDumper.new tables
+     dumper.dump_tables File.open(File::expand_path('../schema.rb', __FILE__), 'w')
+     ActiveRecord::Base.remove_connection
+     puts "schema.rb generated\n"
+   end
+
+   desc "use schema.rb to create the schema in the target database"
+   task :create_tables => :environment do
+     puts "\npreparing to create tables in the target database\n"
+     ActiveRecord::Base.establish_connection(DataTransit::Database.target)
+     require File.expand_path("../schema", __FILE__)
+     ActiveRecord::Base.remove_connection
+     puts "tables created in target database\n"
+   end
+
+   desc "data transit, copy rows from source db tables to target db tables\n it supports incremental copy by additional arguments"
+   task :copy_data, [] => :dsl do
+     worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
+     worker.load_work
+     worker.do_work
+   end
+
+   #task :my, [:arg1, :arg2, :arg3, :arg4] => :environment do |t, args|
+   #  print t, args
+   #  tables = DataTransit::Database.tables
+   #  print tables
+   #end
+
+ end
metadata ADDED
@@ -0,0 +1,66 @@
+ --- !ruby/object:Gem::Specification
+ name: data_transit
+ version: !ruby/object:Gem::Version
+   version: 0.2.0
+ prerelease:
+ platform: ruby
+ authors:
+ - thundercumt
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2015-03-12 00:00:00.000000000 Z
+ dependencies: []
+ description: data_transit relies on activerecord to generate database Models on the
+   fly. It is executed within a database transaction, and should any error occur during
+   data transit, it will cause the transaction to roll back. So don't worry about introducing
+   dirty data into your target database
+ email: thundercumt@126.com
+ executables:
+ - data_transit
+ extensions: []
+ extra_rdoc_files:
+ - README
+ - LICENSE
+ files:
+ - LICENSE
+ - README
+ - Rakefile
+ - Database.yml
+ - bin/data_transit
+ - bin/data_transit.bat
+ - lib/datatransit/clean_find_in_batches.rb
+ - lib/datatransit/cli.rb
+ - lib/datatransit/database.rb
+ - lib/datatransit/helper.rb
+ - lib/datatransit/model/tables_source.rb
+ - lib/datatransit/model/tables_target.rb
+ - lib/datatransit/model_dumper.rb
+ - lib/datatransit/rule_dsl.rb
+ - lib/datatransit/tasks.rb
+ homepage: https://github.com/thundercumt/data_transit
+ licenses: []
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   none: false
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   none: false
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 1.8.28
+ signing_key:
+ specification_version: 3
+ summary: a ruby gem/app used to migrate between databases, supporting customized migration
+   procedure
+ test_files: []