xcopier 1.0.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 315ef998ff2ab861e209e027bc7abd3d59ce3f15cdbaf74e9f3d782db5750cb8
4
+ data.tar.gz: 251b908019418f61bbe4b3acd029b0bd247621c859bee85e586c7e92c55f5587
5
+ SHA512:
6
+ metadata.gz: ae63e3be2c29c82b13c9702c744d9cbd1c2e3ffca0c815cb95261d6f3863854c1cbf8c092b1b133b2073b2e848b09b78f69229ffc0634cff65ea326fa3ea8ba2
7
+ data.tar.gz: 59b587c6d6897b75cf8d5623a3255aa0ac91d06120194b1e9a9279484b0d73c87e66febb76e80556772e9ba516af8597eaee32799b22a4573de72345bb45135d
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## [Unreleased]
2
+
3
+ ## [1.0.0] - 2024-12-29
4
+
5
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2024 Cristian Bica
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,144 @@
1
+ # Xcopier
2
+
3
+ Xcopier is a tool to copy data from one database to another. It is designed to be used in a development environment to copy data from a production database to a local database (e.g., to test a data migration or data fix) allowing you to override and/or anonymize the data.
4
+
5
+ :warning: :warning: :warning:
6
+
7
+ This is a "sharp knife" tool. It can be used to copy data from one database to another. Make sure you properly set the source and destination connections to avoid data loss or corruption.
8
+
9
+ ## Installation
10
+
11
+ Install the gem and add it to the application's Gemfile by executing:
12
+
13
+ ```bash
14
+ bundle add xcopier --group=development
15
+ ```
16
+
17
+ ## Usage
18
+
19
+ Create a file (e.g., `app/libs/company_copier.rb`) and define a class that includes `Xcopier::DSL`.
20
+
21
+ You could also use the generator provided by this gem:
22
+
23
+ ```bash
24
+ bundle exec rails generate xcopier:copier company
25
+ ```
26
+
27
+ ```ruby
28
+ class CompanyCopier
29
+ include Xcopier::DSL
30
+
31
+ # you can use here a symbol to reference a connection defined in database.yml
32
+ # or a hash with connection details
33
+ # or a string with a connection url
34
+ source :production
35
+ destination :development
36
+
37
+ argument :company_ids, :integer, list: true
38
+
39
+ copy :companies, scope: -> { Company.where(id: arguments[:company_ids]) }
40
+ copy :users, scope: -> { User.where(company_id: arguments[:company_ids]) }, chunk_size: 100
41
+ end
42
+ ```
43
+
44
+ Then run the copier:
45
+
46
+ ```bash
47
+ bundle exec xcopier company --company-ids 1,2
48
+ ```
49
+
50
+ The above will load your app, instantiate the `CompanyCopier` class, and run the `copy` method for the `companies` and `users` tables.
51
+
52
+ You could also do this from a Rails console:
53
+
54
+ ```ruby
55
+ CompanyCopier.new(company_ids: [1, 2]).run
56
+ # or give the argument as a string and it will be parsed
57
+ CompanyCopier.new(company_ids: "1,2").run
58
+ ```
59
+
60
+ ### Arguments
61
+
62
+ The DSL includes an `argument` directive. Its purpose is to provide copier arguments to be used in queries to copy data. It supports typecasting for the following types: string, integer, time, date, boolean. You can also specify if the argument is a list by setting the `list` option to `true`.
63
+
64
+ Example:
65
+
66
+ ```ruby
67
+ argument :str, :string
68
+ argument :str_list, :string, list: true
69
+ argument :int, :integer
70
+ argument :int_list, :integer, list: true
71
+ argument :time, :time # it will parse the time using Time.parse
72
+ argument :date, :date # it will parse the date using Date.parse
73
+ argument :bool, :boolean # it will recognize as truthy the values: "1", "yes", "true", true
74
+
75
+ copier = ExampleCopier.new(str: "string", str_list: "string1,string2", int: "1", int_list: "1,2", time: "2020-01-01 12:00", date: "2020-01-01", bool: "true") # ExampleCopier is a hypothetical class declaring the arguments above
76
+
77
+ copier.arguments[:str] # => "string"
78
+ copier.arguments[:str_list] # => ["string1", "string2"]
79
+ copier.arguments[:int] # => 1
80
+ copier.arguments[:int_list] # => [1, 2]
81
+ copier.arguments[:time] # => Time.parse("2020-01-01 12:00")
82
+ copier.arguments[:date] # => Date.parse("2020-01-01")
83
+ copier.arguments[:bool] # => true
84
+ ```
85
+
86
+ Example:
87
+
88
+ ```ruby
89
+ copy :companies, anonymize: true
90
+
91
+ copy :users,
92
+ model: User, # this is not actually needed, it will be inferred from the name
93
+ scope: -> { User.all }, # this is also not needed as .all on the model is the default
94
+ chunk_size: 500, # this is the default value
95
+ overrides: {
96
+ email: ->(email) { email.gsub(/@/, "+#{SecureRandom.hex(4)}@") },
97
+ password: "password",
98
+ last_login_at: -> (last_login_at, attributes) { attributes[:created_at] + 1.minute }
99
+ },
100
+ anonymize: %w(first_name last_name street_address)
101
+ ```
102
+
103
+ ### Copy Operations
104
+
105
+ The `copy` directive tells the copier what to copy. It takes a name followed by these options:
106
+ - `name`
107
+ - a name for the copy operation (usually the table name)
108
+ - will be used to determine the model if not given
109
+ - `model`
110
+ - the model to use for the copy operation
111
+ - if not given it will be inferred from the `name`
112
+ - `scope`
113
+ - a lambda that returns the records to be copied
114
+ - if not given it will copy all records using the `model.all`
115
+ - `chunk_size`
116
+ - the number of records to copy at once
117
+ - default value is `500`
118
+ - `overrides`
119
+ - rules to transform the data before writing
120
+ - it is a hash where the key is the column name and the replacement is the value
121
+ - the value can be a lambda that returns the new value
122
+ - the lambda can receive no arguments or a single argument with the original value or two arguments with the original value and a hash of the record
123
+ - `anonymize`
124
+ - which columns to try to anonymize
125
+ - can be `true` to anonymize all columns or a list of columns to anonymize
126
+ - the remaining notes below describe how anonymization works
127
+ - anonymization is not done for columns that have an override
128
+ - anonymization is done in the [`Xcopier::Anonymizer`](https://github.com/cristianbica/xcopier/blob/master/lib/xcopier/anonymizer.rb) class, is based on the column name and uses the [faker](https://rubygems.org/gems/faker) gem
129
+ - :warning: anonymization is not guaranteed to be secure and currently has a limited implementation
130
+ - feel free to adjust it in your app (`Xcopier::Anonymizer::RULES` is a mutable hash where the key is a regex to match the column and the value is a lambda that returns the anonymized value) or contribute to this gem
131
+
132
+ ## Development
133
+
134
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `bundle exec rake test` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
135
+
136
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
137
+
138
+ ## Contributing
139
+
140
+ Bug reports and pull requests are welcome on GitHub at https://github.com/cristianbica/xcopier.
141
+
142
+ ## License
143
+
144
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
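The typecasting rules described in the README's Arguments section can be illustrated with a small standalone sketch. This is not the gem's code: `typecast` and `parse_argument` are hypothetical helper names, and the sketch assumes list arguments may arrive either as a comma-separated string or as an array.

```ruby
require "date"
require "time"

# Values treated as true for :boolean arguments (taken from the README).
TRUTHY_VALUES = ["1", "yes", "true", true].freeze

# Hypothetical helper: converts a single raw value to the declared type.
def typecast(value, type)
  case type
  when :string then value
  when :integer then value.to_i
  when :time then Time.parse(value)
  when :date then Date.parse(value)
  when :boolean then TRUTHY_VALUES.include?(value)
  else raise ArgumentError, "Unknown argument type: #{type}"
  end
end

# Hypothetical helper: handles both scalar and list arguments.
# Assumption: a list may be given as "1,2" or as [1, 2].
def parse_argument(raw, type, list: false)
  return typecast(raw, type) unless list

  items = raw.is_a?(Array) ? raw : raw.to_s.split(",").map(&:strip)
  items.map { |item| typecast(item, type) }
end

p parse_argument("1, 2", :integer, list: true) # => [1, 2]
p parse_argument("yes", :boolean)              # => true
```

Anything outside the truthy list, such as `"no"` or `"0"`, is treated as `false` rather than rejected.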
data/Rakefile ADDED
@@ -0,0 +1,17 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "bundler/gem_tasks"
4
+ require "minitest/test_task"
5
+
6
+ namespace :dummy do
7
+ require_relative "test/dummy/config/application"
8
+ Dummy::Application.load_tasks
9
+ end
10
+
11
+ Minitest::TestTask.create
12
+
13
+ require "rubocop/rake_task"
14
+
15
+ RuboCop::RakeTask.new
16
+
17
+ task default: %i[test rubocop]
data/exe/xcopier ADDED
@@ -0,0 +1,12 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ unless File.exist?("config/environment.rb")
5
+ warn "Expected to be run from the root of a Rails project"
6
+ exit 1
7
+ end
8
+
9
+ require File.expand_path("config/environment")
10
+ require "xcopier/cli"
11
+
12
+ Xcopier::CLI.start(ARGV)
@@ -0,0 +1,20 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "rails/generators"
4
+
5
+ module Xcopier
6
+ module Generators
7
+ class CopierGenerator < ::Rails::Generators::NamedBase
8
+ source_root File.expand_path("templates", __dir__)
9
+
10
+ def create_copier
11
+ template "copiator.rb.tt", File.join("app/libs", class_path, "#{file_name}_copier.rb")
12
+ end
13
+
14
+ def models
15
+ Rails.application.eager_load! unless Rails.application.config.eager_load
16
+ ApplicationRecord.subclasses
17
+ end
18
+ end
19
+ end
20
+ end
@@ -0,0 +1,21 @@
1
+ # frozen_string_literal: true
2
+
3
+ <% module_namespacing do -%>
4
+ class <%= class_name %>Copier
5
+ include Xcopier::DSL
6
+
7
+ argument :tenant_ids, type: :integer, list: true
8
+ <% models.each do |model| -%>
9
+
10
+ copy :<%= model.table_name %>,
11
+ model: -> { <%= model.name %> },
12
+ scope: -> { <%= model.name %>.where(tenant_id: arguments[:tenant_ids]) },
13
+ overrides: {
14
+ # field: "value"
15
+ # field: -> { "value" }
16
+ # field: ->(value) { value.upcase }
17
+ # field: ->(value, attributes) { attributes[:other_field] == "value" ? "value" : value }
18
+ }
19
+ <% end -%>
20
+ end
21
+ <% end -%>
@@ -0,0 +1,64 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "active_support/core_ext/module/delegation"
4
+
5
+ module Xcopier
6
+ class Actor
7
+ class UnknownMessageError < StandardError; end
8
+
9
+ attr_reader :copier
10
+ attr_accessor :__queue, :thread, :parent, :result
11
+
12
+ delegate :logger, to: :copier
13
+ delegate :log, :info, :debug, :error, to: :logger, allow_nil: true
14
+
15
+ def self.spawn!(*args)
16
+ actor = new(*args)
17
+ actor.__queue = Thread::Queue.new
18
+ actor.thread = Thread.new do
19
+ Thread.current[:xcopier_actor] = actor
20
+ actor.__work__
21
+ end
22
+ actor.thread.name = name.demodulize.underscore
23
+ actor.thread.report_on_exception = false
24
+ actor
25
+ end
26
+
27
+ def initialize(copier)
28
+ @copier = copier
29
+ super()
30
+ end
31
+
32
+ def wait
33
+ thread.value
34
+ end
35
+
36
+ def __work__
37
+ while (message = __queue.pop)
38
+ begin
39
+ return result if message == :__terminate
40
+
41
+ on_message(message)
42
+ rescue Exception => e # rubocop:disable Lint/RescueException
43
+ on_error(e)
44
+ end
45
+ end
46
+ end
47
+
48
+ def tell(message)
49
+ __queue.push(message)
50
+ end
51
+
52
+ def on_message(message)
53
+ raise NotImplementedError
54
+ end
55
+
56
+ def terminate!
57
+ debug "#{self.class.name.demodulize}: terminating"
58
+ __queue.clear
59
+ __queue.push(:__terminate)
60
+ end
61
+
62
+ def on_error(error); end
63
+ end
64
+ end
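The mailbox pattern used by the actor class above (one thread draining a `Thread::Queue` until it sees a terminate marker) can be tried in isolation. The sketch below is a simplified illustration, not the gem's implementation: `EchoActor` and `stop` are hypothetical names, and unlike `terminate!` above it does not clear pending messages before stopping.

```ruby
# Hypothetical minimal actor: collects every message it is told.
class EchoActor
  def initialize
    @queue = Thread::Queue.new
    @received = []
    # The worker thread starts only after the queue and state exist.
    @thread = Thread.new { work }
  end

  # Enqueue a message; returns immediately.
  def tell(message)
    @queue.push(message)
  end

  # Ask the worker to finish once it has drained earlier messages.
  def stop
    @queue.push(:__terminate)
  end

  # Block until the worker exits and return its result.
  def wait
    @thread.value
  end

  private

  def work
    while (message = @queue.pop)
      return @received if message == :__terminate

      @received << message
    end
  end
end

actor = EchoActor.new
actor.tell(:a)
actor.tell(:b)
actor.stop
p actor.wait # => [:a, :b]
```

Because `Thread#value` joins the thread, `wait` doubles as both synchronization and result retrieval, which is the same role `wait` plays in the class above.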
@@ -0,0 +1,36 @@
1
+ # frozen_string_literal: true
2
+
3
+ begin
4
+ require "faker"
5
+ rescue LoadError # rubocop:disable Lint/SuppressedException
6
+ end
7
+
8
+ module Xcopier
9
+ module Anonymizer
10
+ module_function
11
+
12
+ RULES = { # rubocop:disable Style/MutableConstant
13
+ /email/ => -> { Faker::Internet.email },
14
+ /first_?name/ => -> { Faker::Name.first_name },
15
+ /last_?name/ => -> { Faker::Name.last_name },
16
+ /name/ => -> { Faker::Name.name },
17
+ /phone/ => -> { Faker::PhoneNumber.phone_number },
18
+ /address/ => -> { Faker::Address.full_address },
19
+ /city/ => -> { Faker::Address.city },
20
+ /country/ => -> { Faker::Address.country },
21
+ /zip/ => -> { Faker::Address.zip_code },
22
+ /(company|organization)/ => -> { Faker::Company.name }
23
+ }
24
+
25
+ def anonymize(name, value)
26
+ unless defined?(Faker)
27
+ puts "[WARN] Please add the faker gem to your Gemfile to be able to automatically anonymize data"
28
+ return value
29
+ end
30
+ RULES.each do |rule, block|
31
+ return block.call if name.match?(rule)
32
+ end
33
+ value
34
+ end
35
+ end
36
+ end
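Because the rules hash above is scanned in insertion order and the first matching pattern wins, rule order matters. The following standalone sketch illustrates that behaviour with fixed placeholder values instead of Faker; `SKETCH_RULES` and `sketch_anonymize` are hypothetical names, and only three of the patterns are reproduced.

```ruby
# Hypothetical rule table: regex on the column name => value generator.
# Fixed strings stand in for Faker calls so the sketch has no dependencies.
SKETCH_RULES = {
  /email/ => -> { "user@example.com" },
  /first_?name/ => -> { "Jane" },
  /name/ => -> { "Jane Doe" }
}

# Returns the first matching rule's value, or the original value if none match.
def sketch_anonymize(column, value)
  SKETCH_RULES.each do |pattern, generator|
    return generator.call if column.match?(pattern)
  end
  value
end

p sketch_anonymize("first_name", "Bob")    # => "Jane"
p sketch_anonymize("company_name", "Acme") # => "Jane Doe" (the generic /name/ rule matches)
p sketch_anonymize("created_at", 123)      # => 123 (no rule matches)

# New rules are appended, so they are only consulted after the existing ones.
SKETCH_RULES[/ssn/] = -> { "000-00-0000" }
p sketch_anonymize("ssn", "123-45-6789")   # => "000-00-0000"
```

One consequence worth checking in a real setup: a column such as `company_name` matches the generic `/name/` pattern before any later company-specific pattern is reached.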
@@ -0,0 +1,73 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "optparse"
4
+ require "active_support/core_ext/string/inflections"
5
+
6
+ module Xcopier
7
+ class CLI
8
+ def self.start(args)
9
+ new(args).run
10
+ end
11
+
12
+ def initialize(args)
13
+ @args = args
14
+ end
15
+
16
+ def run
17
+ valid = parse
18
+ return unless valid
19
+
20
+ @options[:copier].new(**@options[:args]).run
21
+ end
22
+
23
+ def parse # rubocop:disable Metrics/CyclomaticComplexity,Metrics/AbcSize,Metrics/PerceivedComplexity,Metrics/MethodLength
24
+ @options = { copier_name: @args.shift, args: {} }
25
+ @options[:copier] = @options[:copier_name].classify.safe_constantize
26
+
27
+ parser = OptionParser.new do |opts|
28
+ opts.banner = "Usage: xcopier copier_name [--arg1=value1 ...] [options]"
29
+
30
+ opts.separator "\nCOPIER ARGUMENTS" if @options[:copier]&._arguments&.any?
31
+ @options[:copier]&._arguments&.each do |arg|
32
+ @options[:args][arg[:name]] = nil
33
+ banner = "--#{arg[:name].to_s.gsub("_", "-").parameterize}"
34
+ banner += "=VALUE" if arg[:type] != :boolean
35
+ help = "#{arg[:name]} copier argument"
36
+ help << " (comma separated list)" if arg[:list]
37
+ opts.on(banner, help) do |value|
38
+ @options[:args][arg[:name]] = value
39
+ end
40
+ end
41
+
42
+ opts.separator "\nOPTIONS"
43
+ opts.on("-h", "--help", "Show help message") do
44
+ @options[:help] = true
45
+ end
46
+
47
+ opts.on("-v", "--verbose", "Run verbosely") do
48
+ @options[:verbose] = true
49
+ end
50
+ end
51
+ parser.parse(@args)
52
+
53
+ if @options[:help]
54
+ puts parser
55
+ return false
56
+ end
57
+
58
+ if @options[:copier].nil?
59
+ puts "ERROR: Copier not found\n\n"
60
+ puts parser
61
+ return false
62
+ end
63
+
64
+ if @options[:args].values.any?(&:nil?)
65
+ puts "ERROR: Missing argument values\n\n"
66
+ puts parser
67
+ return false
68
+ end
69
+
70
+ true
71
+ end
72
+ end
73
+ end
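The way the CLI above derives one command-line flag per declared copier argument can be sketched with plain `OptionParser`. This is an illustration under assumptions: `parse_copier_args` is a hypothetical helper, and the definitions are plain hashes with `:name` and `:type` keys.

```ruby
require "optparse"

# Hypothetical helper: builds one flag per argument definition and
# returns the raw (still untyped) values keyed by argument name.
def parse_copier_args(definitions, argv)
  # Start with nil for every argument so missing ones are easy to detect.
  values = definitions.to_h { |definition| [definition[:name], nil] }

  parser = OptionParser.new do |opts|
    definitions.each do |definition|
      flag = "--#{definition[:name].to_s.tr("_", "-")}"
      # Non-boolean arguments require a value; booleans are bare switches.
      flag += "=VALUE" unless definition[:type] == :boolean
      opts.on(flag) { |value| values[definition[:name]] = value }
    end
  end

  parser.parse(argv)
  values
end

definitions = [
  { name: :company_ids, type: :integer },
  { name: :dry_run, type: :boolean }
]
p parse_copier_args(definitions, ["--company-ids", "1,2", "--dry-run"])
# => {:company_ids=>"1,2", :dry_run=>true} (hash inspect format varies by Ruby version)
```

Values come back as raw strings here; typecasting would happen later, when the copier is instantiated.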
@@ -0,0 +1,99 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "active_support/concern"
4
+ require "active_support/core_ext/class/attribute"
5
+ require "active_support/core_ext/enumerable"
6
+ require "xcopier/version"
7
+ require_relative "operation"
8
+ require_relative "runner"
9
+
10
+ module Xcopier
11
+ module DSL
12
+ extend ActiveSupport::Concern
13
+
14
+ BOOLS = ["1", "yes", "true", true].freeze
15
+
16
+ included do
17
+ attr_reader :arguments, :logger
18
+
19
+ class_attribute :source, default: nil
20
+ class_attribute :destination, default: :development
21
+ class_attribute :_arguments, instance_accessor: false, instance_predicate: false, default: []
22
+ class_attribute :_operations, instance_accessor: false, instance_predicate: false, default: []
23
+ end
24
+
25
+ class_methods do
26
+ def copy(name, **options)
27
+ _operations << { name: name, **options }
28
+ end
29
+
30
+ def argument(name, type = :string, **options)
31
+ _arguments << { name: name.to_sym, type: type, **options }
32
+ end
33
+ end
34
+
35
+ def initialize(**args)
36
+ validate_arguments(args)
37
+ parse_arguments(args)
38
+ end
39
+
40
+ def operations
41
+ @operations ||= self.class._operations.map { |operation| Operation.new(self, **operation) }
42
+ end
43
+
44
+ def run
45
+ setup
46
+ Runner.run(self)
47
+ ensure
48
+ teardown
49
+ end
50
+
51
+ private
52
+
53
+ def setup
54
+ ApplicationRecord.connects_to(
55
+ shards: {
56
+ xcopier: { reading: source, writing: destination }
57
+ }
58
+ )
59
+ @logger = Xcopier.logger
60
+ end
61
+
62
+ def teardown; end
63
+
64
+ def validate_arguments(args)
65
+ given_arguments = args.keys
66
+ expected_arguments = self.class._arguments.pluck(:name)
67
+
68
+ missing_arguments = expected_arguments - given_arguments
69
+ raise ArgumentError, "Missing arguments: #{missing_arguments}" if missing_arguments.any?
70
+
71
+ unknown_arguments = given_arguments - expected_arguments
72
+ raise ArgumentError, "Unknown arguments: #{unknown_arguments}" if unknown_arguments.any?
73
+ end
74
+
75
+ def parse_arguments(args)
76
+ @arguments = self.class._arguments.each_with_object({}) do |definition, hash|
77
+ name = definition[:name]
78
+
79
+ if definition[:list]
80
+ values = Array(args[name]).flat_map { |v| v.to_s.split(",") }.map(&:strip) # accept "1,2" as well as [1, 2]
81
+ hash[name] = values.map { |v| typecast_argument(v, definition) }
82
+ else
83
+ hash[name] = typecast_argument(args[name], definition)
84
+ end
85
+ end
86
+ end
87
+
88
+ def typecast_argument(value, definition)
89
+ type = definition[:type]
90
+ return value if type == :string
91
+ return value.to_i if type == :integer
92
+ return Time.parse(value) if type == :time
93
+ return Date.parse(value) if type == :date
94
+ return Xcopier::DSL::BOOLS.include?(value) if type == :boolean
95
+
96
+ raise ArgumentError, "Unknown argument type: #{type}"
97
+ end
98
+ end
99
+ end
@@ -0,0 +1,32 @@
1
+ # frozen_string_literal: true
2
+
3
+ require "active_support/core_ext/hash/indifferent_access"
4
+
5
+ module Xcopier
6
+ class Operation
7
+ attr_reader :name, :model, :scope, :chunk_size, :overrides, :anonymize
8
+
9
+ def initialize(copier, name:, model: nil, scope: nil, chunk_size: 500, overrides: {}, anonymize: []) # rubocop:disable Metrics/ParameterLists
10
+ @name = name
11
+ @model = model
12
+ @scope = scope
13
+ @chunk_size = chunk_size
14
+ @overrides = overrides.is_a?(Hash) ? overrides.with_indifferent_access : overrides
15
+ @anonymize = anonymize.is_a?(Array) ? anonymize.map(&:to_s) : anonymize
16
+ prepare_model_and_scope(copier)
17
+ end
18
+
19
+ def inspect
20
+ "#<#{self.class.name} name: #{name}, model: #{model.name}, chunk_size: #{chunk_size}, overrides: #{overrides.inspect}, anonymize: #{anonymize.inspect}>"
21
+ end
22
+
23
+ private
24
+
25
+ def prepare_model_and_scope(copier)
26
+ @model = name.to_s.classify.constantize if model.nil?
27
+ @model = model.call if model.is_a?(Proc)
28
+ @scope = copier.instance_exec(&scope) if scope.is_a?(Proc)
29
+ @scope = model.all if scope.nil?
30
+ end
31
+ end
32
+ end
@@ -0,0 +1,6 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Xcopier
4
+ class Railtie < Rails::Railtie
5
+ end
6
+ end
@@ -0,0 +1,62 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "actor"
4
+
5
+ module Xcopier
6
+ class Reader < Actor
7
+ attr_reader :queue, :operation
8
+
9
+ def initialize(queue, *rest)
10
+ @queue = queue
11
+ super(*rest)
12
+ end
13
+
14
+ def on_message(message)
15
+ case message
16
+ in [:read, Operation => operation]
17
+ debug "Reader#message: type=read operation=#{operation.inspect}"
18
+ process(operation)
19
+ else
20
+ debug "Reader#message: type=unknown message=#{message.inspect}"
21
+ raise UnknownMessageError, "Unknown message: #{message.inspect}"
22
+ end
23
+ end
24
+
25
+ def on_error(error)
26
+ debug "Reader#error: #{error.message}"
27
+ parent.tell([:error, error])
28
+ end
29
+
30
+ def process(operation)
31
+ setup(operation)
32
+ work
33
+ ensure
34
+ teardown
35
+ end
36
+
37
+ def work
38
+ each_chunk do |chunk|
39
+ queue.push(chunk)
40
+ debug "Reader: read and pushed #{chunk.size} records"
41
+ end
42
+ debug "Reader: done"
43
+ queue.push(:done)
44
+ end
45
+
46
+ def each_chunk
47
+ ApplicationRecord.connected_to(shard: :xcopier, role: :reading) do
48
+ operation.scope.in_batches(of: operation.chunk_size) do |relation|
49
+ yield operation.model.connection.exec_query(relation.to_sql).to_a
50
+ end
51
+ end
52
+ end
53
+
54
+ def setup(operation)
55
+ @operation = operation
56
+ end
57
+
58
+ def teardown
59
+ @operation = nil
60
+ end
61
+ end
62
+ end
@@ -0,0 +1,81 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "actor"
4
+ require_relative "reader"
5
+ require_relative "transformer"
6
+ require_relative "writer"
7
+
8
+ module Xcopier
9
+ class Runner < Actor
10
+ attr_reader :source_queue, :destination_queue, :copier, :reader, :transformer, :writer, :promise
11
+ attr_accessor :index
12
+
13
+ def self.run(copier)
14
+ runner = spawn!(copier)
15
+ runner.tell(:run)
16
+ ret = runner.wait
17
+ raise ret if ret.is_a?(Exception)
18
+
19
+ ret
20
+ end
21
+
22
+ def initialize(copier)
23
+ @source_queue = Queue.new
24
+ @destination_queue = Queue.new
25
+ @reader = Reader.spawn!(source_queue, copier).tap { |actor| actor.parent = self }
26
+ @transformer = Transformer.spawn!(source_queue, destination_queue, copier).tap { |actor| actor.parent = self }
27
+ @writer = Writer.spawn!(destination_queue, copier).tap { |actor| actor.parent = self }
28
+ @index = -1
29
+ super
30
+ end
31
+
32
+ def on_message(message)
33
+ case message
34
+ in :run
35
+ debug "Runner#message: type=run"
36
+ process
37
+ in :done
38
+ process
39
+ in [:error, e]
40
+ debug "Runner#message: type=error error=#{e.message}"
41
+ finish(e)
42
+ else
43
+ debug "Runner#message: type=unknown message=#{message.inspect}"
44
+ raise UnknownMessageError, "Unknown message: #{message.inspect}"
45
+ end
46
+ end
47
+
48
+ def on_error(error)
49
+ debug "Runner#error: #{error.message}"
50
+ finish(error)
51
+ end
52
+
53
+ def process
54
+ self.index += 1
55
+ if current_operation.nil?
56
+ finish(true)
57
+ else
58
+ reader.tell([:read, current_operation])
59
+ transformer.tell([:transform, current_operation])
60
+ writer.tell([:write, current_operation])
61
+ end
62
+ end
63
+
64
+ def finish(message = nil)
65
+ debug "Runner#finish: message=#{message.inspect}"
66
+ self.result = message
67
+
68
+ source_queue.push(:done)
69
+ destination_queue.push(:done)
70
+
71
+ reader.terminate!
72
+ transformer.terminate!
73
+ writer.terminate!
74
+ terminate!
75
+ end
76
+
77
+ def current_operation
78
+ copier.operations[index]
79
+ end
80
+ end
81
+ end
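The three-stage flow coordinated above (a reader pushing chunks onto one queue, a transformer moving them to a second queue, a writer draining it, with `:done` as the end marker) can be reduced to a self-contained sketch. The names here are hypothetical, plain threads stand in for the actor classes, and an in-memory array stands in for both databases.

```ruby
# Hypothetical pipeline: returns the records "written" after transformation.
def run_pipeline(records, chunk_size: 2, &transform)
  source = Thread::Queue.new
  destination = Thread::Queue.new
  written = []

  # Reader: slice the input into chunks, then signal completion.
  reader = Thread.new do
    records.each_slice(chunk_size) { |chunk| source.push(chunk) }
    source.push(:done)
  end

  # Transformer: map each chunk and forward the :done marker downstream.
  transformer = Thread.new do
    loop do
      chunk = source.pop
      if chunk == :done
        destination.push(:done)
        break
      end
      destination.push(chunk.map { |record| transform.call(record) })
    end
  end

  # Writer: append chunks until the :done marker arrives.
  writer = Thread.new do
    loop do
      chunk = destination.pop
      break if chunk == :done

      written.concat(chunk)
    end
  end

  [reader, transformer, writer].each(&:join)
  written
end

p run_pipeline([1, 2, 3, 4, 5]) { |n| n * 10 } # => [10, 20, 30, 40, 50]
```

With a single thread per stage and FIFO queues, chunk order is preserved end to end.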
@@ -0,0 +1,96 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "actor"
4
+ require_relative "anonymizer"
5
+
6
+ module Xcopier
7
+ class Transformer < Actor
8
+ attr_reader :input, :output, :operation
9
+
10
+ def initialize(input, output, *rest)
11
+ @input = input
12
+ @output = output
13
+ super(*rest)
14
+ end
15
+
16
+ def on_message(message)
17
+ case message
18
+ in [:transform, Operation => operation]
19
+ debug "Transformer#message: type=transform operation=#{operation.inspect}"
20
+ process(operation)
21
+ else
22
+ debug "Transformer#message: type=unknown message=#{message.inspect}"
23
+ raise UnknownMessageError, "Unknown message: #{message.inspect}"
24
+ end
25
+ end
26
+
27
+ def on_error(error)
28
+ debug "Transformer#error: #{error.message}"
29
+ parent.tell([:error, error])
30
+ end
31
+
32
+ def process(operation)
33
+ setup(operation)
34
+ work
35
+ ensure
36
+ teardown
37
+ end
38
+
39
+ def work
40
+ loop do
41
+ chunk = input.pop
42
+
43
+ if chunk == :done
44
+ debug "Transformer: done"
45
+ output.push(:done)
46
+ @operation = nil
47
+ break
48
+ end
49
+ debug "Transformer: transforming #{chunk.size} records"
50
+ transformed = transform(chunk)
51
+ output.push(transformed)
52
+ end
53
+ end
54
+
55
+ def transform(chunk)
56
+ chunk.map do |record|
57
+ transform_record(record)
58
+ end
59
+ end
60
+
61
+ def transform_record(record)
62
+ record.each_with_object({}) do |(key, value), hash|
63
+ value = apply_overrides(value, key, record)
64
+ value = apply_anonymization(value, key)
65
+ hash[key] = value
66
+ end
67
+ end
68
+
69
+ def setup(operation)
70
+ @operation = operation
71
+ end
72
+
73
+ def teardown
74
+ @operation = nil
75
+ end
76
+
77
+ private
78
+
79
+ def apply_overrides(value, key, record)
80
+ return value unless operation.overrides.key?(key)
81
+
82
+ new_value = operation.overrides[key]
83
+ return new_value unless new_value.respond_to?(:call)
84
+
85
+ new_value.call(*[value, record].first(new_value.arity))
86
+ end
87
+
88
+ def apply_anonymization(value, key)
89
+ return value if operation.overrides.key?(key)
90
+ return Anonymizer.anonymize(key, value) if operation.anonymize == true
91
+ return Anonymizer.anonymize(key, value) if operation.anonymize.is_a?(Array) && operation.anonymize.include?(key)
92
+
93
+ value
94
+ end
95
+ end
96
+ end
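The override dispatch above passes a callable only as many arguments as its arity asks for. A standalone sketch of that idea follows; `apply_override` is a hypothetical helper, and the sketch assumes callables with exactly zero, one, or two required parameters (a lambda with optional parameters has a negative arity, which `Array#first` rejects).

```ruby
# Hypothetical helper: resolves an override to the value that gets written.
def apply_override(override, value, record)
  # Plain values (strings, numbers, nil) replace the column as-is.
  return override unless override.respond_to?(:call)

  # Pass only as many of [value, record] as the callable declares.
  override.call(*[value, record].first(override.arity))
end

# Assumption: rows are string-keyed hashes, as database result rows often are.
record = { "email" => "bob@example.com", "role" => "admin" }

p apply_override("password", "secret", record)                 # => "password"
p apply_override(-> { "constant" }, "secret", record)          # => "constant"
p apply_override(->(v) { v.upcase }, "secret", record)         # => "SECRET"
p apply_override(->(_v, row) { row["role"] }, "secret", record) # => "admin"
```

If the rows handed to an override are string-keyed, as assumed here, a lookup with a symbol key such as `row[:role]` would return `nil`, so it is worth confirming the key type before relying on two-argument overrides.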
@@ -0,0 +1,5 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Xcopier
4
+ VERSION = "1.0.0"
5
+ end
@@ -0,0 +1,65 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "actor"
4
+
5
+ module Xcopier
6
+ class Writer < Actor
7
+ attr_reader :queue, :operation
8
+
9
+ def initialize(queue, *rest)
10
+ @queue = queue
11
+ super(*rest)
12
+ end
13
+
14
+ def on_message(message)
15
+ case message
16
+ in [:write, Operation => operation]
17
+ debug "Writer#message: type=write operation=#{operation.inspect}"
18
+ process(operation)
19
+ else
20
+ debug "Writer#message: type=unknown message=#{message.inspect}"
21
+ raise UnknownMessageError, "Unknown message: #{message.inspect}"
22
+ end
23
+ end
24
+
25
+ def on_error(error)
26
+ debug "Writer#error: #{error.message}"
27
+ parent.tell([:error, error])
28
+ end
29
+
30
+ def process(operation)
31
+ setup(operation)
32
+ work
33
+ ensure
34
+ teardown
35
+ end
36
+
37
+ def work
38
+ loop do
39
+ chunk = queue.pop
40
+ if chunk == :done
41
+ debug "Writer: done"
42
+ @operation = nil
43
+ parent.tell(:done)
44
+ break
45
+ end
46
+ write(chunk)
47
+ end
48
+ end
49
+
50
+ def write(records)
51
+ ApplicationRecord.connected_to(shard: :xcopier, role: :writing) do
52
+ debug "Writer: writing #{records.size} records"
53
+ operation.model.insert_all(records)
54
+ end
55
+ end
56
+
57
+ def setup(operation)
58
+ @operation = operation
59
+ end
60
+
61
+ def teardown
62
+ @operation = nil
63
+ end
64
+ end
65
+ end
data/lib/xcopier.rb ADDED
@@ -0,0 +1,14 @@
1
+ # frozen_string_literal: true
2
+
3
+ require_relative "xcopier/version"
4
+ require_relative "xcopier/dsl"
5
+ require_relative "xcopier/railtie"
6
+
7
+ require "active_support/core_ext/module/attribute_accessors"
8
+ require "logger"
9
+
10
+ module Xcopier
11
+ mattr_accessor :logger do
12
+ @logger ||= Logger.new($stdout, level: ENV.fetch("LOG_LEVEL", :info))
13
+ end
14
+ end
metadata ADDED
@@ -0,0 +1,113 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: xcopier
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Cristian Bică
8
+ autorequire:
9
+ bindir: exe
10
+ cert_chain: []
11
+ date: 2024-12-29 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: activerecord
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '7.0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '7.0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: activesupport
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '7.0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '7.0'
41
+ - !ruby/object:Gem::Dependency
42
+ name: appraisal
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ description: Xcopier is a tool to copy data from one database to another. It is designed
56
+ to be used in a development environment to copy data from a production database
57
+ to a local database (e.g., to test a data migration or data fix) allowing you to
58
+ override and/or anonymize the data.
59
+ email:
60
+ - cristian.bica@gmail.com
61
+ executables:
62
+ - xcopier
63
+ extensions: []
64
+ extra_rdoc_files: []
65
+ files:
66
+ - CHANGELOG.md
67
+ - LICENSE.txt
68
+ - README.md
69
+ - Rakefile
70
+ - exe/xcopier
71
+ - lib/generators/xcopier/copier/copier_generator.rb
72
+ - lib/generators/xcopier/copier/templates/copiator.rb.tt
73
+ - lib/xcopier.rb
74
+ - lib/xcopier/actor.rb
75
+ - lib/xcopier/anonymizer.rb
76
+ - lib/xcopier/cli.rb
77
+ - lib/xcopier/dsl.rb
78
+ - lib/xcopier/operation.rb
79
+ - lib/xcopier/railtie.rb
80
+ - lib/xcopier/reader.rb
81
+ - lib/xcopier/runner.rb
82
+ - lib/xcopier/transformer.rb
83
+ - lib/xcopier/version.rb
84
+ - lib/xcopier/writer.rb
85
+ homepage: https://github.com/cristianbica/xcopier
86
+ licenses:
87
+ - MIT
88
+ metadata:
89
+ allowed_push_host: https://rubygems.org
90
+ homepage_uri: https://github.com/cristianbica/xcopier
91
+ source_code_uri: https://github.com/cristianbica/xcopier
92
+ changelog_uri: https://github.com/cristianbica/xcopier/blob/master/CHANGELOG.md
93
+ rubygems_mfa_required: 'true'
94
+ post_install_message:
95
+ rdoc_options: []
96
+ require_paths:
97
+ - lib
98
+ required_ruby_version: !ruby/object:Gem::Requirement
99
+ requirements:
100
+ - - ">="
101
+ - !ruby/object:Gem::Version
102
+ version: 3.0.0
103
+ required_rubygems_version: !ruby/object:Gem::Requirement
104
+ requirements:
105
+ - - ">="
106
+ - !ruby/object:Gem::Version
107
+ version: '0'
108
+ requirements: []
109
+ rubygems_version: 3.5.22
110
+ signing_key:
111
+ specification_version: 4
112
+ summary: Xcopier is a tool to copy data from one database to another.
113
+ test_files: []