data_transit 0.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/Database.yml ADDED
@@ -0,0 +1,27 @@
+ source:
+   adapter: oracle_enhanced
+   host: 10.162.211.9
+   port: 1521
+   database: ztjc2
+   username: mw_ztjc
+   password: ztjc
+
+ target:
+   adapter: oracle_enhanced
+   host: 10.162.106.153
+   port: 1521
+   database: orcl
+   username: mw_app
+   password: mw_app
+
+ test_target:
+   adapter: sqlite3
+   database: development.sqlite3
+   pool: 5
+   timeout: 5000
+
+ production:
+   adapter: oracle
+   database: comics_catalog_production
+   username: comics_catalog
+   password:
data/LICENSE ADDED
@@ -0,0 +1,22 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2015 tianyuan
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
+
data/README ADDED
@@ -0,0 +1,150 @@
+ data_transit
+ data_transit is a ruby gem/app used to migrate data between different
+ relational databases, supporting a customized migration procedure.
+
+
+ #1 Introduction
+ data_transit relies on activerecord to generate database models on the fly.
+ It is executed within a database transaction, and should any error occur
+ during data transit, the transaction will roll back. So don't worry about
+ introducing dirty data into your target database.
+
+
+ #2 Install
+ data_transit can be installed using gem
+     gem install data_transit
+ or
+     download data_transit.gem
+     gem install --local /where/you/put/data_transit.gem
+
+ In your command line, run data_transit. You can proceed to "How to use" if
+ you see a message prompt like the one below.
+     data_transit
+     usage: data_transit command args. 4 commands are listed below...
+
+
+ #3 How to use
+
+ #3.1 Configure the Database Connection
+     data_transit setup_db_conn /your/yaml/database/config/file
+
+ Your db config file should be compatible with the activerecord adapter you are
+ using and configured properly. As the name suggests, source designates the
+ database you plan to copy data from, while target is the copy destination.
+ Note that the keys 'source' and 'target' must be kept unchanged!
+ For example, here is a sample file for an Oracle DBMS.
+
+ database.yml
+     source: #don't change this line
+       adapter: oracle_enhanced
+       database: dbserver
+       username: xxx
+       password: secret
+
+     target: #don't change this line
+       adapter: oracle_enhanced
+       database: orcl
+       username: copy1
+       password: cipher
+
+
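A config file like the one above can be sanity-checked with plain Ruby before handing it to setup_db_conn. This is an illustrative sketch only, not part of the gem; the required keys follow the README above.

```ruby
require 'yaml'

# Illustrative only: verify a database.yml has the top-level 'source' and
# 'target' sections data_transit expects, each with an adapter and database.
def check_db_config(path)
  config = YAML.load(File.read(path))
  %w[source target].each do |role|
    raise "missing top-level key '#{role}'" unless config.is_a?(Hash) && config.key?(role)
    %w[adapter database].each do |field|
      raise "'#{role}' is missing '#{field}'" unless config[role].is_a?(Hash) && config[role].key?(field)
    end
  end
  config
end
```

Running `check_db_config('database.yml')` before setup_db_conn fails fast on a malformed file instead of failing later at connection time.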
+ #3.2 Create the Database Schema (optional)
+ If your target database schema is already created, move on to 3.3 "Copy Data".
+ Otherwise use your database-specific tools to generate an identical schema in
+ your target database. Or, if you don't have a handy tool to generate the
+ schema, for instance when you need to migrate between different database
+ systems, you can use data_transit to dump a schema description file based on
+ your source database schema, and then use this file to create your target
+ schema.
+
+ Note this gem is built on activerecord, so it works best for database schemas
+ compatible with rails conventions: for example, a single-column primary key
+ rather than a compound primary key (a primary key with more than one column),
+ and a primary key of integer type instead of other types like guid, timestamp
+ etc.
+
+ In data_transit, I coded against the situation where a non-integer primary
+ key is used, which results in a minor limitation: the batch-query feature
+ provided by activerecord cannot be used, because it depends on an integer
+ primary key. In this special case, a careless selection of the copy range
+ might overburden the network and the database server, because all data in the
+ specified range are transmitted from the source database and then inserted
+ into the target database.
+
+
+ #3.2.1 Dump the Source Database Schema
+     data_transit dump_schema [schema_file] [rule_file]
+ [schema_file] will contain the dumped schema; [rule_file] describes how you
+ want data_transit to copy your data.
+
+ Note if your source schema is somewhat legacy, you might need some manual
+ work to adjust the generated schema_file to better meet your needs.
+
+ For example, in my test the source schema uses obj_id as the id column and
+ guid as the primary key type, so I need to keep activerecord from
+ auto-generating an "id" column by removing :primary_key => "obj_id", adding
+ :id => false, and then appending a primary key constraint at the end of each
+ table definition. See below.
+
+ Here is an example dumped schema file:
+
+     ActiveRecord::Schema.define(:version => 0) do
+       create_table "table_one", :primary_key => "obj_id", :force => true do |t|
+         #other fields
+       end
+       #other tables
+     end
+
+ and I manually changed the schema definition to
+
+     ActiveRecord::Schema.define(:version => 0) do
+       create_table "table_one", :id => false, :force => true do |t|
+         t.string "obj_id", :limit => 42
+         #other fields
+       end
+       execute "alter table table_one add primary key(obj_id)"
+     end
+
+ #3.2.2 Create the Target Database Schema
+     data_transit create_table [schema_file]
+ If everything goes well, you will see a bunch of ddl execution history.
+
+
+ #3.3 Copy Data
+     data_transit copy_data [rule_file]
+ [rule_file] contains your copy logic. For security reasons, I changed the
+ table names; it looks as follows.
+
+
+     start_date = "2015-01-01 00:00:00"
+     end_date = "2015-02-01 00:00:00"
+
+     migrate do
+       choose_table "APP.TABLE1","APP.TABLE2","APP.TABLE3","APP.TABLE4","APP.TABLE5","APP.TABLE6"
+       batch_by "ACQUISITION_TIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
+       register_primary_key 'OBJ_ID'
+     end
+
+     migrate do
+       choose_table "APP.TABLE7","APP.TABLE8","APP.TABLE9","APP.TABLE10","APP.TABLE11","APP.TABLE12"
+       batch_by "ACKTIME BETWEEN TO_DATE('#{start_date}','yyyy-mm-dd hh24:mi:ss') AND TO_DATE('#{end_date}', 'yyyy-mm-dd hh24:mi:ss')"
+       register_primary_key 'OBJ_ID'
+     end
+
+     migrate do
+       choose_table "APP.TABLE13","APP.TABLE14","APP.TABLE15","APP.TABLE16","APP.TABLE17","APP.TABLE18"
+       batch_by "1>0" #query all data because these tables don't have a reasonable range
+       register_primary_key 'OBJ_ID'
+       pre_work do |targetCls| targetCls.delete_all("1>0") end #delete all rows in the target table first
+       #post_work do |targetCls| end
+     end
+
+
+ Each migrate block contains a data_transit task.
+
+ "choose_table" lists the tables included in this task. These tables share
+ some nature in common and can be processed with the same rule.
+
+ "batch_by" is the query condition.
+
+ "register_primary_key" declares the primary key of the tables.
+
+ "pre_work" is a block executed before each table is processed.
+
+ "post_work" is a block executed after each table is processed.
data/Rakefile ADDED
@@ -0,0 +1,47 @@
+ #
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+
+ require 'rubygems'
+ require 'rake'
+ require 'rake/clean'
+ require 'rubygems/package_task'
+ require 'rdoc/task'
+ require 'rake/testtask'
+
+ spec = Gem::Specification.new do |s|
+   s.name = 'data_transit'
+   s.version = '0.2.0'
+   s.has_rdoc = true
+   s.extra_rdoc_files = ['README', 'LICENSE']
+   s.summary = 'a ruby gem/app used to migrate between databases, supporting customized migration procedure'
+   s.description = 'data_transit relies on activerecord to generate database Models on the fly. It is executed within a database transaction, and should any error occur during data transit, it will cause the transaction to rollback. So don\'t worry about introducing dirty data into your target database'
+   s.author = 'thundercumt'
+   s.email = 'thundercumt@126.com'
+   s.homepage = 'https://github.com/thundercumt/data_transit'
+   s.executables << "data_transit"
+   s.files = %w(LICENSE README Rakefile Database.yml) + Dir.glob("{bin,lib,spec}/**/*")
+   s.require_path = "lib"
+   s.bindir = "bin"
+ end
+
+ Gem::PackageTask.new(spec) do |p|
+   p.gem_spec = spec
+   p.need_tar = true
+   p.need_zip = true
+ end
+
+ Rake::RDocTask.new do |rdoc|
+   files = ['README', 'LICENSE', 'lib/**/*.rb']
+   rdoc.rdoc_files.add(files)
+   rdoc.main = "README" # page to start on
+   rdoc.title = "Data_Transit Docs"
+   rdoc.rdoc_dir = 'doc/rdoc' # rdoc output folder
+   rdoc.options << '--line-numbers'
+ end
+
+ Rake::TestTask.new do |t|
+   t.test_files = FileList['test/**/*.rb']
+ end
data/bin/data_transit ADDED
@@ -0,0 +1,40 @@
+ #!/usr/bin/env ruby
+
+ require File::expand_path('../../lib/datatransit/cli', __FILE__)
+
+ command = ARGV[0]
+
+ help_msg = "usage: data_transit command args. 4 commands are listed below\n
+ 1 data_transit setup_db_conn [conn.yml]
+   conn.yml: the yml file describing db connection and adapters\n
+ 2 data_transit dump_schema [schema_file] [rule_file]
+   schema_file: the file to store source database schema
+   rule_file: the user migration rule\n
+ 3 data_transit create_table [schema_file]
+   schema_file: the file generated by dump_schema\n
+ 4 data_transit copy_data [rule_file]
+   rule_file: the user migration rule"
+
+ case command
+ when "setup_db_conn"
+   yml = ARGV[1]
+   if File::exists?(yml)
+     setup_db_conn yml
+   else
+     print "file #{yml} does not exist!\n"
+   end
+
+ when "dump_schema"
+   schema = ARGV[1]
+   rule = ARGV[2]
+   if not (schema =~ /\.rb$/)
+     schema += ".rb"
+   end
+   dump_schema schema, rule
+ when "create_table"
+   create_tables ARGV[1]
+ when "copy_data"
+   copy_data ARGV[1]
+ else
+   puts help_msg
+ end
data/bin/data_transit.bat ADDED
@@ -0,0 +1,6 @@
+ @ECHO OFF
+ IF NOT "%~f0" == "~f0" GOTO :WinNT
+ @"ruby.exe" "data_transit" %1 %2 %3 %4 %5 %6 %7 %8 %9
+ GOTO :EOF
+ :WinNT
+ @"ruby.exe" "data_transit" %*
data/lib/datatransit/clean_find_in_batches.rb ADDED
@@ -0,0 +1,34 @@
+ module CleanFindInBatches
+
+   def self.included(base)
+     base.class_eval do
+       alias :old_find_in_batches :find_in_batches
+       alias :find_in_batches :replacement_find_in_batches
+     end
+   end
+
+   # Override because the stock find_in_batches implementation
+   # conflicts with string/UUID primary keys
+   def replacement_find_in_batches(options = {}, &block)
+     relation = self
+     return old_find_in_batches(options, &block) if relation.primary_key.is_a?(Arel::Attributes::Integer)
+     # Raise errors like the real thing
+     if (finder_options = options.except(:batch_size)).present?
+       raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
+       raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
+       raise 'You can\'t specify start, it\'s forced to be 0 because the ID is a string' if options.delete(:start)
+       relation = apply_finder_options(finder_options)
+     end
+     # Compute the batch size
+     batch_size = options.delete(:batch_size) || 1000
+     offset = 0
+     # Walk the relation until there's nothing left
+     relation = relation.except(:order).order(batch_order).limit(batch_size)
+     while (results = relation.offset(offset).limit(batch_size).all).any?
+       block.call results
+       offset += batch_size
+     end
+     nil
+   end
+
+ end
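The override above pages through rows with a manual offset/limit loop instead of relying on integer-id ranges. The control flow can be seen in isolation in the sketch below, where a plain array stands in for the ActiveRecord relation; it is illustrative only, not the gem's code.

```ruby
# Sketch of the offset/limit batching loop used above; `rows[offset, size]`
# plays the role of relation.offset(offset).limit(size).
def each_batch(rows, batch_size = 1000)
  offset = 0
  while (results = rows[offset, batch_size]) && results.any?
    yield results
    offset += batch_size
  end
  nil
end

batches = []
each_batch((1..5).to_a, 2) { |b| batches << b }
# batches is [[1, 2], [3, 4], [5]]
```

The same loop shape works for any primary-key type because it never compares key values, at the cost of the database re-scanning skipped rows on each offset.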
data/lib/datatransit/cli.rb ADDED
@@ -0,0 +1,66 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ #require 'rubygems'
+ #require 'bundler/setup'
+ #Bundler.require
+
+ #initial code
+ ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
+ # It is recommended to set the time zone in the TZ environment variable so that
+ # the same timezone is used by Ruby and by the Oracle session
+ ENV['TZ'] = 'UTC'
+
+ require 'active_record'
+ require File::expand_path('../clean_find_in_batches', __FILE__)
+ ActiveRecord::Batches.send(:include, CleanFindInBatches)
+
+ require File::expand_path('../database', __FILE__)
+ require File::expand_path('../model/tables_source', __FILE__)
+ require File::expand_path('../model/tables_target', __FILE__)
+ require File::expand_path('../model_dumper', __FILE__)
+ require File::expand_path('../rule_dsl', __FILE__)
+
+ def dump_schema schema_file, rule_file
+   puts "preparing to generate schema.rb, the schema of source database\n"
+   ActiveRecord::Base.establish_connection(DataTransit::Database.source)
+   #tables = DataTransit::Database.tables
+   worker = DTWorker.new rule_file
+   worker.load_work
+   tables = worker.get_all_tables
+
+   print tables, "\n"
+
+   dumper = DataTransit::ModelDumper.new tables
+   dumper.dump_tables File.open(schema_file, 'w')
+   ActiveRecord::Base.remove_connection
+   puts "schema.rb generated\n"
+ end
+
+ def create_tables schema_file
+   puts "\npreparing to create tables in the target database\n"
+   ActiveRecord::Base.establish_connection(DataTransit::Database.target)
+   require schema_file
+   ActiveRecord::Base.remove_connection
+   puts "tables created in target database\n"
+ end
+
+
+ def copy_data rule_file
+   worker = DTWorker.new rule_file
+   worker.load_work
+   worker.do_work
+ end
+
+ def setup_db_conn db_yml
+   if File::exists? db_yml
+     src = File::open(db_yml, "r")
+     dst = File::open( File.expand_path('../../../database.yml', __FILE__), "w" )
+
+     src.each_line { |line|
+       dst.write line
+     }
+     src.close
+     dst.close
+   end
+ end
data/lib/datatransit/database.rb ADDED
@@ -0,0 +1,21 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module DataTransit
+
+   class Database
+
+     @@dbconfig = YAML::load( File.open( File.expand_path('../../../database.yml', __FILE__) ) )
+
+     def self.source
+       @@dbconfig['source']
+     end
+
+     def self.target
+       @@dbconfig['target']
+     end
+
+   end
+
+ end
data/lib/datatransit/helper.rb ADDED
@@ -0,0 +1,26 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module TimeExt
+   def fly_by_day days
+     Time.at(self.to_i + days * 24 * 60 * 60)
+   end
+
+   def fly_by_week weeks
+     Time.at(self.to_i + weeks * 7 * 24 * 60 * 60)
+   end
+
+   def fly_by_month months
+     # approximates a month as 30 days
+     Time.at(self.to_i + months * 30 * 24 * 60 * 60)
+   end
+
+   def midnight
+     # Time.at takes epoch seconds, not calendar components; build the
+     # time explicitly instead
+     Time.local(self.year, self.month, self.day, 0, 0, 0)
+   end
+
+ end
+
+ class Time
+   include TimeExt
+ end
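A quick sketch of the extension in use. The module is reproduced here (the two simple offset helpers only) so the example runs on its own; it mirrors the helper.rb code above rather than requiring the gem.

```ruby
# Self-contained copy of the TimeExt offset helpers, for illustration.
module TimeExt
  def fly_by_day(days)
    Time.at(to_i + days * 24 * 60 * 60)
  end

  def fly_by_week(weeks)
    Time.at(to_i + weeks * 7 * 24 * 60 * 60)
  end
end

class Time
  include TimeExt
end

t = Time.utc(2015, 1, 1, 0, 0, 0)
next_day  = t.fly_by_day(1)   # 86,400 seconds later
next_week = t.fly_by_week(1)  # 604,800 seconds later
```

Because the helpers work on epoch seconds, the results are exact second offsets; they do not account for DST transitions or variable month lengths.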
data/lib/datatransit/model/tables_source.rb ADDED
@@ -0,0 +1,18 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ require 'active_record'
+ require File::expand_path('../../database', __FILE__)
+
+ module DataTransit
+
+   module Source
+
+     class SourceBase < ActiveRecord::Base
+       self.abstract_class = true
+       establish_connection DataTransit::Database.source
+     end
+   end
+
+ end
data/lib/datatransit/model/tables_target.rb ADDED
@@ -0,0 +1,18 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ require 'active_record'
+ require File::expand_path('../../database', __FILE__)
+
+ module DataTransit
+
+   module Target
+
+     class TargetBase < ActiveRecord::Base
+       self.abstract_class = true
+       establish_connection DataTransit::Database.target
+     end
+   end
+
+ end
data/lib/datatransit/model_dumper.rb ADDED
@@ -0,0 +1,65 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ module DataTransit
+   class GivenTableDumper < ActiveRecord::SchemaDumper
+
+     def self.give_tables tables
+       @@tables = tables
+     end
+
+     def dump(stream)
+ =begin
+       if all_tables_exist?
+         header(stream)
+         tables(stream)
+         trailer(stream)
+       else
+         print "No schema generated, because some table[s] do not exist!\n"
+       end
+ =end
+       header(stream)
+       tables(stream)
+       trailer(stream)
+
+       stream
+     end
+
+     def tables(stream)
+       @@tables.each do |tbl|
+         table(tbl, stream)
+       end
+     end
+
+     private
+     def all_tables_exist?
+       all_tables = @connection.tables
+       tables_all_exist = true
+
+       @@tables.each do |tbl|
+         unless all_tables.include? tbl
+           print "table [", tbl, "] doesn't exist!\n"
+           tables_all_exist = false
+         end
+       end
+
+       tables_all_exist
+     end
+   end
+
+
+   class ModelDumper
+
+     def initialize table_list
+       @tables = table_list
+     end
+
+     def dump_tables (stream)
+       GivenTableDumper.give_tables @tables
+       GivenTableDumper.dump ActiveRecord::Base.connection, stream
+     end
+   end
+
+ end
data/lib/datatransit/rule_dsl.rb ADDED
@@ -0,0 +1,235 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+
+ #chain_filter1 --> chain_filter2 --> chain_filter3
+
+ require 'singleton'
+ require 'progress_bar'
+
+ #require File::expand_path('../database', __FILE__)
+ require File::expand_path('../model/tables_source', __FILE__)
+ require File::expand_path('../model/tables_target', __FILE__)
+
+ module RULE_DSL
+
+   class Task
+     attr_accessor :tables, :search_cond, :pk, :pre_work, :post_work, :filter
+   end
+
+   class TaskSet
+     def initialize
+       @tasks = []
+     end
+
+     def push_back task
+       @tasks << task
+     end
+
+     def pop_head
+       @tasks.delete_at 0
+     end
+
+     def first_task
+       @tasks[0]
+     end
+
+     def last_task
+       @tasks[@tasks.length - 1]
+     end
+
+     def length
+       @tasks.length
+     end
+
+     def task_at idx
+       @tasks[idx]
+     end
+
+     def each
+       for i in 0 .. @tasks.length-1
+         yield @tasks[i]
+       end
+     end
+   end
+
+   def load_rule *rule_file
+     rule_file.each{ |file| self.instance_eval File.read(file), file }
+   end
+
+   def migrate &script
+     @taskset ||= TaskSet.new
+     task = Task.new
+     @taskset.push_back task
+     yield
+   end
+
+   def choose_table *tables
+     task = @taskset.last_task
+     task.tables = tables
+   end
+
+   def batch_by search_cond
+     task = @taskset.last_task
+     task.search_cond = search_cond
+   end
+
+   def register_primary_key *key
+     task = @taskset.last_task
+     task.pk = key.map(&:downcase)
+   end
+
+   def filter_out_with &filter
+     task = @taskset.last_task
+     task.filter = filter
+   end
+
+   def pre_work &pre
+     task = @taskset.last_task
+     task.pre_work = pre
+   end
+
+   def post_work &post
+     task = @taskset.last_task
+     task.post_work = post
+   end
+
+   def get_all_tables
+     tables = []
+     @taskset.each { |task| tables << task.tables }
+     tables.flatten
+   end
+ end
+
+ class DTWorker
+   include RULE_DSL
+   attr_accessor :batch_size
+
+   def initialize rule_file
+     @rule_file = rule_file
+     @batch_size = 500
+   end
+
+   def load_work
+     load_rule @rule_file
+   end
+
+   def do_work
+     #we copy all or nothing. a mess is not welcome here
+     #on failures, the transaction will roll back automatically
+     DataTransit::Target::TargetBase.transaction do
+       @taskset.each do |task|
+         do_task task
+       end
+     end
+   end
+
+   def do_task task
+     pks = task.pk #survives the context switches below
+     tables = task.tables
+
+     tables.each do |tbl|
+       sourceCls = Class.new(DataTransit::Source::SourceBase) do
+         self.table_name = tbl
+       end
+
+       #columns = sourceCls.columns.map(&:name).map(&:downcase)
+       columns = sourceCls.columns
+       pk_column = get_pk_column(columns, pks)
+       if pk_column != nil
+         pk, pk_type = pk_column.name, pk_column.type
+       else
+         pk, pk_type = nil, nil
+       end
+
+       sourceCls.instance_eval( "self.primary_key = \"#{pk}\"") if pk != nil
+
+       targetCls = Class.new(DataTransit::Target::TargetBase) do
+         self.table_name = tbl
+       end
+       targetCls.instance_eval( "self.primary_key = \"#{pk}\"") if pk != nil
+
+       print "\ntable ", tbl, ":\n"
+       do_user_ar_proc targetCls, task.pre_work if task.pre_work
+       #column types are symbols, so compare against :integer
+       do_batch_copy sourceCls, targetCls, task, pk, pk_type == :integer
+       do_user_ar_proc targetCls, task.post_work if task.post_work
+     end
+   end
+
+   def do_user_ar_proc targetCls, proc
+     proc.call targetCls
+   end
+
+   def do_batch_copy (sourceCls, targetCls, task, pk = nil, in_batch = false)
+     count = sourceCls.where(task.search_cond).size.to_f
+     return if count <= 0
+
+     how_many_batch = (count / @batch_size).ceil
+     #the progress bar
+     bar = ProgressBar.new(count)
+
+     if in_batch
+       0.upto (how_many_batch-1) do |i|
+         sourceCls.where(task.search_cond).find_each(
+           start: i * @batch_size, batch_size: @batch_size) do |source_row|
+
+           #update progress
+           bar.increment!
+
+           if task.filter
+             next if do_filter_out task.filter, source_row
+           end
+           target_row = targetCls.new source_row.attributes
+
+           #activerecord ignores the pk field, so the initialization above
+           #results in a nil primary key. Copy the original pk into
+           #target_row, which is exactly what we need.
+           if pk
+             target_row.send( "#{pk}=", source_row.send("#{pk}") )
+           end
+
+           target_row.save
+         end
+       end
+     else
+       sourceCls.where(task.search_cond).each do |source_row|
+         #update progress
+         bar.increment!
+
+         if task.filter
+           next if do_filter_out task.filter, source_row
+         end
+         target_row = targetCls.new source_row.attributes
+
+         #activerecord ignores the pk field, so the initialization above
+         #results in a nil primary key. Copy the original pk into
+         #target_row, which is exactly what we need.
+         if pk
+           target_row.send( "#{pk}=", source_row.send("#{pk}") )
+         end
+
+         target_row.save
+       end
+     end
+
+   end
+
+   def do_filter_out filter, row
+     if filter.call row
+       return true
+     end
+     false
+   end
+
+   private
+   def get_pk_column(columns, given_pk)
+     column_names = columns.map(&:name).map(&:downcase)
+     pk = column_names & given_pk
+     if pk && pk.length > 0
+       pk = pk[0]
+       return columns[column_names.index(pk)]
+     else
+       return nil
+     end
+   end
+
+ end
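How a rule file is evaluated can be seen in miniature. The sketch below re-implements just enough of the RULE_DSL flow above (load via instance_eval, one Task per migrate block, keyword methods filling the most recent task) to run standalone; it is illustrative, not the gem's actual code path.

```ruby
# Minimal standalone re-implementation of the RULE_DSL evaluation flow:
# each `migrate` block appends a task, and choose_table / batch_by /
# register_primary_key fill in the most recently appended task.
class MiniRuleLoader
  Task = Struct.new(:tables, :search_cond, :pk)

  attr_reader :tasks

  def initialize
    @tasks = []
  end

  # Evaluate rule text in the loader's own context, like load_rule does.
  def load_rule(text)
    instance_eval(text)
  end

  def migrate(&block)
    @tasks << Task.new
    instance_eval(&block)
  end

  def choose_table(*tables)
    @tasks.last.tables = tables
  end

  def batch_by(cond)
    @tasks.last.search_cond = cond
  end

  def register_primary_key(*keys)
    @tasks.last.pk = keys.map(&:downcase)
  end
end

loader = MiniRuleLoader.new
loader.load_rule(<<~RULE)
  migrate do
    choose_table "APP.TABLE1", "APP.TABLE2"
    batch_by "1>0"
    register_primary_key 'OBJ_ID'
  end
RULE
```

Because the rule file is just Ruby evaluated against the worker, arbitrary Ruby (date arithmetic, string interpolation) is available inside it, which is what the README's start_date/end_date example relies on.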
data/lib/datatransit/tasks.rb ADDED
@@ -0,0 +1,71 @@
+ # To change this license header, choose License Headers in Project Properties.
+ # To change this template file, choose Tools | Templates
+ # and open the template in the editor.
+ require 'rubygems'
+ require 'bundler/setup'
+ Bundler.require
+
+ #initial code
+ ENV['NLS_LANG'] = 'SIMPLIFIED CHINESE_CHINA.ZHS16GBK' if ENV['NLS_LANG'] == nil
+ # It is recommended to set the time zone in the TZ environment variable so that
+ # the same timezone is used by Ruby and by the Oracle session
+ ENV['TZ'] = 'UTC'
+
+
+ namespace :db do
+
+   desc "environment verification"
+   task :environment do
+     require 'active_record'
+     #require 'activerecord-oracle_enhanced-adapter'
+     #require 'oci8'
+
+     require File::expand_path('../database', __FILE__)
+     require File::expand_path('../model/tables_source', __FILE__)
+     require File::expand_path('../model/tables_target', __FILE__)
+     require File::expand_path('../model_dumper', __FILE__)
+   end
+
+   desc "dsl related definitions"
+   task :dsl => :environment do
+     require File::expand_path('../rule_dsl', __FILE__)
+   end
+
+   desc "generate schema.rb, the schema of source database"
+   task :dump_schema => :dsl do
+     puts "preparing to generate schema.rb, the schema of source database\n"
+     ActiveRecord::Base.establish_connection(DataTransit::Database.source)
+     #tables = DataTransit::Database.tables
+     worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
+     worker.load_work
+     tables = worker.get_all_tables
+
+     print tables, "\n"
+
+     dumper = DataTransit::ModelDumper.new tables
+     dumper.dump_tables File.open(File::expand_path('../schema.rb', __FILE__), 'w')
+     ActiveRecord::Base.remove_connection
+     puts "schema.rb generated\n"
+   end
+
+   desc "use schema.rb to create the schema in the target database"
+   task :create_tables => :environment do
+     puts "\npreparing to create tables in the target database\n"
+     ActiveRecord::Base.establish_connection(DataTransit::Database.target)
+     require File.expand_path("../schema", __FILE__)
+     ActiveRecord::Base.remove_connection
+     puts "tables created in target database\n"
+   end
+
+   desc "data transit, copy rows from source db tables to target db tables\n it supports incremental copy by additional arguments"
+   task :copy_data, [] => :dsl do
+     worker = DTWorker.new File::expand_path('../../rule.rb', __FILE__)
+     worker.load_work
+     worker.do_work
+   end
+
+   #task :my, [:arg1, :arg2, :arg3, :arg4] => :environment do |t, args|
+   #print t, args
+   #tables = DataTransit::Database.tables
+   #print tables
+   #end
+
+ end
metadata ADDED
@@ -0,0 +1,66 @@
+ --- !ruby/object:Gem::Specification
+ name: data_transit
+ version: !ruby/object:Gem::Version
+   version: 0.2.0
+   prerelease:
+ platform: ruby
+ authors:
+ - thundercumt
+ autorequire:
+ bindir: bin
+ cert_chain: []
+ date: 2015-03-12 00:00:00.000000000 Z
+ dependencies: []
+ description: data_transit relies on activerecord to generate database Models on the
+   fly. It is executed within a database transaction, and should any error occur during
+   data transit, it will cause the transaction to rollback. So don't worry about introducing
+   dirty data into your target database
+ email: thundercumt@126.com
+ executables:
+ - data_transit
+ extensions: []
+ extra_rdoc_files:
+ - README
+ - LICENSE
+ files:
+ - LICENSE
+ - README
+ - Rakefile
+ - Database.yml
+ - bin/data_transit
+ - bin/data_transit.bat
+ - lib/datatransit/clean_find_in_batches.rb
+ - lib/datatransit/cli.rb
+ - lib/datatransit/database.rb
+ - lib/datatransit/helper.rb
+ - lib/datatransit/model/tables_source.rb
+ - lib/datatransit/model/tables_target.rb
+ - lib/datatransit/model_dumper.rb
+ - lib/datatransit/rule_dsl.rb
+ - lib/datatransit/tasks.rb
+ homepage: https://github.com/thundercumt/data_transit
+ licenses: []
+ post_install_message:
+ rdoc_options: []
+ require_paths:
+ - lib
+ required_ruby_version: !ruby/object:Gem::Requirement
+   none: false
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ required_rubygems_version: !ruby/object:Gem::Requirement
+   none: false
+   requirements:
+   - - ! '>='
+     - !ruby/object:Gem::Version
+       version: '0'
+ requirements: []
+ rubyforge_project:
+ rubygems_version: 1.8.28
+ signing_key:
+ specification_version: 3
+ summary: a ruby gem/app used to migrate between databases, supporting customized migration
+   procedure
+ test_files: []