exwiw 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 0aa1977437fc4e44349ecc11431e4f8697acb1471e159777c629850a52de1664
4
+ data.tar.gz: cd92c7a05f6958d2f0cc5ee1204b30fe7b6bd6e168aa18c0b2b182da3b00af9a
5
+ SHA512:
6
+ metadata.gz: e979a144ac442f73483c23c93cb335742b43e9d198811788edb9a416a3d70cd3b23b64ab0783cf20b4452afa00d46827aaed629d021d880172d373491cc9ec96
7
+ data.tar.gz: 35375bb916081981ff264dd74a53b59638c78ad0e2f4dfb40d8716bec59caa387dc114001234be8bb2ba20a3345ecac9d88a1b7b0322ae38b8d92759c61efa75
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## [Unreleased]
2
+
3
+ ## [0.1.0] - 2025-01-31
4
+
5
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 Shia
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,183 @@
1
+ # Exwiw
2
+
3
+ Export What I Want (Exwiw) is a Ruby gem that exports records from a database to a dump file (specifically, a full list of INSERT statements) based on the specified conditions.
4
+
5
+ ## When to use
6
+
7
+ In most cases when developing software, there is no better choice than data that matches production.
8
+ You can hand-craft test data, but it is very hard to maintain.
9
+
10
+ If you are looking for a way to maintain data for the development environment, exwiw might be a solution.
11
+
12
+ - Export the full database and mask data and import to another database.
13
+ - Set up a system to replicate and mask data in real time to another database.
14
+
15
+
16
+ With exwiw, you export only the data you want.
17
+
18
+ ## Features
19
+
20
+ - Export the full list of INSERT sql for the specified conditions.
21
+ - Provide several masking options for sensitive columns.
22
+ - Provide config generator for ActiveRecord.
23
+
24
+ ## Installation
25
+
26
+ ```bash
27
+ bundle add exwiw
28
+ ```
29
+
30
+ In most cases, you will want to add `require: false` to the gem's entry in the Gemfile.
31
+
32
+ If bundler is not being used to manage dependencies, install the gem by executing:
33
+
34
+ ```bash
35
+ gem install exwiw
36
+ ```
37
+
38
+ ## Supported Databases
39
+
40
+ - mysql2
41
+ - postgresql
42
+ - sqlite3
43
+
44
+ ## Usage
45
+
46
+ ### Command
47
+
48
+ ```bash
49
+ # dump & masking all records from database to dump.sql based on schema.json
50
+ # pass the database password via the 'DATABASE_PASSWORD' environment variable; --ids takes comma-separated ids
51
+ exwiw \
52
+ --adapter=mysql2 \
53
+ --host=localhost \
54
+ --port=3306 \
55
+ --user=reader \
56
+ --database=app_production \
57
+ --config-dir=exwiw \
58
+ --target-table=shops \
59
+ --ids=1 \
60
+ --output-dir=dump \
61
+ --log-level=info
62
+ ```
63
+
64
+ This command will generate sql files in the `dump` directory.
65
+
66
+ - `dump/insert-{idx}-{table_name}.sql`
67
+ - `dump/delete-{idx}-{table_name}.sql`
68
+
69
+ `idx` indicates the order of the dump. A file with a larger idx might depend on one with a smaller idx,
70
+ so you should import the dump files in ascending idx order.
71
+
72
+ You need to delete the existing records before importing the dump;
73
+ `delete-{idx}-{table_name}.sql` helps you do that.
74
+ This SQL deletes "all" records related to the extraction targets.
75
+ idx has the same meaning as in the insert SQL.
76
+
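Because the import order is encoded in the file names, the files can be sorted numerically by idx before they are fed to a database client. The following is a minimal sketch, not part of exwiw: the helper name `ordered_dump_files` and the sample file names are made up for illustration.

```ruby
# Hypothetical helper (not part of exwiw): return the dump files of one kind
# ("insert" or "delete") sorted by the numeric idx embedded in the file name.
def ordered_dump_files(paths, kind)
  pattern = /\A#{Regexp.escape(kind)}-(\d+)-.+\.sql\z/
  paths
    .select { |path| File.basename(path).match?(pattern) }
    .sort_by { |path| File.basename(path)[pattern, 1].to_i }
end

# Example file names are made up for illustration.
files = %w[
  dump/insert-10-orders.sql
  dump/insert-2-users.sql
  dump/delete-1-shops.sql
  dump/insert-1-shops.sql
]

puts ordered_dump_files(files, "insert")
```

Sorting on the integer value rather than on the raw string keeps `insert-10-...` after `insert-2-...`, which a plain lexicographic sort would get wrong.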
77
+ ### Generator
78
+
79
+ The config generator is provided as a Rake task.
80
+
81
+ ```bash
82
+ # generate table schema under exwiw/
83
+ bundle exec rake exwiw:schema:generate
84
+ ```
85
+
86
+ ### Configuration
87
+
88
+ This is an example of the schema for a single table:
89
+
90
+ ```json
91
+ {
92
+ "name": "users",
93
+ "primary_key": "id",
94
+ "filter": "users.id > 0",
95
+ "belongs_to": [{
96
+ "name": "companies",
97
+ "foreign_key": "company_id"
98
+ }],
99
+ "columns": [{
100
+ "name": "id"
101
+ }, {
102
+ "name": "email",
103
+ "replace_with": "user{id}@example.com"
104
+ }, {
105
+ "name": "company_id"
106
+ }]
107
+ }
108
+ ```
109
+
110
+ exwiw loads every JSON file in the directory given by `--config-dir`.
111
+
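As a rough illustration of what "one JSON file per table" looks like from Ruby, here is a minimal sketch that parses every `*.json` file in a directory with the standard `json` library. The function `load_table_configs` is hypothetical and is not exwiw's actual loader; the sample file is written to a temporary directory only so the sketch runs on its own.

```ruby
require "json"
require "tmpdir"

# Hypothetical loader (not exwiw's actual code): parse every *.json file in a
# directory into a Hash, in file-name order.
def load_table_configs(config_dir)
  Dir.glob(File.join(config_dir, "*.json")).sort.map do |path|
    JSON.parse(File.read(path))
  end
end

configs = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "users.json"), <<~JSON)
    {
      "name": "users",
      "primary_key": "id",
      "belongs_to": [{ "name": "companies", "foreign_key": "company_id" }],
      "columns": [{ "name": "id" }, { "name": "email", "replace_with": "user{id}@example.com" }]
    }
  JSON
  load_table_configs(dir)
end

puts configs.first["name"]
puts configs.first["columns"].map { |column| column["name"] }.inspect
```

Note that `JSON.parse` is strict by default: a trailing comma after the last key of an object raises `JSON::ParserError`, so hand-edited config files are worth validating this way.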
112
+ ### Filter
113
+
114
+ In some cases, you don't need all the records related to the target, e.g. you may want to dump user access logs only for the last year.
115
+ `filter` is for that. Use this option with care, because it is:
116
+
117
+ - injected as-is into the table condition (e.g. WHERE on MySQL), so you should qualify column names with the table name to avoid ambiguity.
118
+ - injected into every WHERE / JOIN clause, so it affects all tables that depend on the filtered table, which can result in data inconsistency.
119
+
120
+ ### Masking
121
+
122
+ `exwiw` provides several options for masking value.
123
+
124
+ #### `replace_with`
125
+
126
+ It replaces the value with the specified string.
127
+ You can embed a column name in `{}` to interpolate that column's value.
128
+
129
+ For example, for a record whose id is 1,
130
+ "user{id}@example.com" becomes "user1@example.com".
131
+
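The adapters in this release compile a `replace_with` template into a SQL `CONCAT(...)` expression, so the substitution happens inside the database. The sketch below is a standalone restatement of that logic for illustration; the function name `replace_with_to_concat` is made up, and the quote doubling in the literal branch is an assumption added here for safety.

```ruby
# Standalone sketch of how a replace_with template can be turned into a SQL
# CONCAT expression. The function name is hypothetical.
def replace_with_to_concat(template, table_name)
  # Split into literal runs and {column} placeholders.
  parts = template.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
    if part.start_with?("{")
      # "{id}" -> "users.id"
      "#{table_name}.#{part[1..-2]}"
    else
      # Literal text becomes a quoted SQL string (single quotes doubled).
      "'#{part.gsub("'", "''")}'"
    end
  end
  "CONCAT(#{parts.join(', ')})"
end

puts replace_with_to_concat("user{id}@example.com", "users")
# prints: CONCAT('user', users.id, '@example.com')
```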
132
+ #### `raw_sql`
133
+
134
+ The given SQL expression is used instead of the original value.
135
+
136
+ For example, `"raw_sql": "CONCAT('user', shops.id, '@example.com')"` is equivalent to
137
+ `"replace_with": "user{id}@example.com"`.
138
+ This is useful when you want to transform with functions provided by the database.
139
+
140
+ Note that you should qualify column names with the table name to avoid ambiguity.
141
+
142
+ If it is used together with `replace_with`, `replace_with` is ignored.
143
+
144
+ #### `map`
145
+
146
+ XXX: TODO
147
+
148
+ The given value will be evaluated as Ruby code and treated as a proc.
149
+
150
+ ```
151
+ "map": "proc { |r| 'user' + r['id'].to_s + '@example.com' }"
152
+ ```
153
+
154
+ which is equivalent to `"replace_with": "user{id}@example.com"`.
155
+
156
+ Note that this is the most powerful option, but use it with care:
157
+ the transformation runs in the exwiw process, so it is much slower than the other options.
158
+ In most cases, this option is not recommended.
159
+
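Since `map` is still marked as TODO, the following is only a sketch of how such an option could behave, not a description of exwiw's implementation. It assumes each row is handed to the proc as a Hash keyed by column name; that row shape and the variable names are assumptions.

```ruby
# The string as it would appear in the JSON config (hypothetical value).
map_source = "proc { |r| 'user' + r['id'].to_s + '@example.com' }"

# Evaluate the config string into a Proc. eval runs arbitrary Ruby, so this is
# only acceptable for config files you fully trust.
mapper = eval(map_source)
raise TypeError, "map must evaluate to a Proc" unless mapper.is_a?(Proc)

# Assumption: each row is passed as a Hash keyed by column name.
row = { "id" => 1, "email" => "alice@real.example" }
puts mapper.call(row)
```

Running the proc once per row in Ruby is what makes this slower than `replace_with` or `raw_sql`, where the database does the transformation.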
160
+ ## How it works
161
+
162
+ - Load the table information from the specified config file.
163
+ - Calculate the dependency between tables.
164
+ - Generate the full list of INSERT sql based on the specified conditions.
165
+ - If the processing table has no relation with target tables, then dump all records.
166
+ - If the processing table has relation with target tables, then dump the records which are related to the target tables.
167
+ - Generate the full list of DELETE sql based on the specified conditions.
168
+ - If the processing table has no relation with target tables, then delete all records.
169
+ - If the processing table has relation with target tables, then delete the records which are related to the target tables.
170
+
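The "calculate the dependency between tables" step can be pictured as a topological sort over the `belongs_to` relations: a table is emitted only after every table it belongs to. The sketch below uses a plain depth-first search; the function `import_order` and the sample table names are hypothetical, and exwiw's real algorithm may differ.

```ruby
# Hypothetical sketch: order tables so that each one comes after the tables it
# belongs to. `dependencies` maps a table name to the names it depends on.
def import_order(dependencies)
  ordered = []
  state = {}

  visit = lambda do |name|
    case state[name]
    when :done then next
    when :visiting then raise ArgumentError, "circular dependency at #{name}"
    end

    state[name] = :visiting
    dependencies.fetch(name, []).each { |parent| visit.call(parent) }
    state[name] = :done
    ordered << name
  end

  dependencies.each_key { |name| visit.call(name) }
  ordered
end

tables = {
  "users" => ["companies"],
  "companies" => [],
  "orders" => ["users"]
}

puts import_order(tables).inspect
# prints: ["companies", "users", "orders"]
```

The position of a table in this list plays the role of `idx` in the dump file names: parents get smaller numbers than their children.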
171
+ ## Development
172
+
173
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
174
+
175
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
176
+
177
+ ## Contributing
178
+
179
+ Bug reports and pull requests are welcome on GitHub at https://github.com/riseshia/exwiw.
180
+
181
+ ## License
182
+
183
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/exe/exwiw ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require 'exwiw/cli'
5
+
6
+ Exwiw::CLI.start(ARGV)
@@ -0,0 +1,171 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ module Adapter
5
+ class Mysql2Adapter < Base
6
+ def execute(query_ast)
7
+ sql = compile_ast(query_ast)
8
+
9
+ @logger.debug(" Executing SQL: \n#{sql}")
10
+ connection.query(sql, cast: false, as: :array).to_a
11
+ end
12
+
13
+ def to_bulk_insert(results, table)
14
+ table_name = table.name
15
+
16
+ value_list = results.map do |row|
17
+ quoted_values = row.map do |value|
18
+ escape_value(value)
19
+ end
20
+ "(" + quoted_values.join(', ') + ")"
21
+ end
22
+ values = value_list.join(",\n")
23
+
24
+ column_names = table.columns.map(&:name).join(', ')
25
+ "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
26
+ end
27
+
28
+ def to_bulk_delete(select_query_ast, table)
29
+ raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
30
+
31
+ sql = "DELETE FROM #{select_query_ast.from_table_name}"
32
+
33
+ if select_query_ast.join_clauses.empty?
34
+ # Ignore filter option, because bulk delete is for cleaning before import,
35
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
36
+ compiled_where_conditions = select_query_ast.
37
+ where_clauses.
38
+ select { |where| where.is_a?(Exwiw::QueryAst::WhereClause) }.
39
+ map do |where|
40
+ compile_where_condition(where, select_query_ast.from_table_name)
41
+ end
42
+
43
+ if compiled_where_conditions.size > 0
44
+ sql += "\nWHERE "
45
+ sql += compiled_where_conditions.join(' AND ')
46
+ end
47
+ sql += ";"
48
+
49
+ return sql
50
+ end
51
+
52
+ subquery_ast = Exwiw::QueryAst::Select.new
53
+ first_join = select_query_ast.join_clauses.first.clone
54
+
55
+ subquery_ast.from(first_join.join_table_name)
56
+ primary_key_col = table.columns.find { |col| col.name == table.primary_key }
57
+ subquery_ast.select([primary_key_col])
58
+ select_query_ast.join_clauses[1..].each do |join|
59
+ subquery_ast.join(join)
60
+ end
61
+ first_join.where_clauses.each do |where|
62
+ # Ignore filter option, because bulk delete is for cleaning before import,
63
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
64
+ subquery_ast.where(where) if where.is_a?(Exwiw::QueryAst::WhereClause)
65
+ end
66
+
67
+ foreign_key = first_join.foreign_key
68
+ subquery_sql = compile_ast(subquery_ast)
69
+ sql += "\nWHERE #{select_query_ast.from_table_name}.#{foreign_key} IN (#{subquery_sql});"
70
+
71
+ sql
72
+ end
73
+
74
+ def compile_ast(query_ast)
75
+ raise NotImplementedError unless query_ast.is_a?(Exwiw::QueryAst::Select)
76
+
77
+ sql = "SELECT "
78
+ sql += query_ast.columns.map { |col| compile_column_name(query_ast, col) }.join(', ')
79
+ sql += " FROM #{query_ast.from_table_name}"
80
+
81
+ query_ast.join_clauses.each do |join|
82
+ sql += " JOIN #{join.join_table_name} ON #{query_ast.from_table_name}.#{join.foreign_key} = #{join.join_table_name}.#{join.primary_key}"
83
+
84
+ join.where_clauses.each do |where|
85
+ compiled_where_condition = compile_where_condition(where, join.join_table_name)
86
+ sql += " AND #{compiled_where_condition}"
87
+ end
88
+ end
89
+
90
+ if query_ast.where_clauses.any?
91
+ sql += " WHERE "
92
+ sql += query_ast.where_clauses.map { |where| compile_where_condition(where, query_ast.from_table_name) }.join(' AND ')
93
+ end
94
+
95
+ sql
96
+ end
97
+
98
+ private def compile_where_condition(where_clause, table_name)
99
+ # Use as it is if it's a raw query
100
+ return where_clause if where_clause.is_a?(String)
101
+
102
+ key = "#{table_name}.#{where_clause.column_name}"
103
+
104
+ if where_clause.operator == :eq
105
+ values = where_clause.value.map { |v| escape_value(v) }
106
+
107
+ if values.size == 1
108
+ "#{key} = #{values.first}"
109
+ else
110
+ "#{key} IN (#{values.join(', ')})"
111
+ end
112
+ else
113
+ raise "Unsupported operator: #{where_clause.operator}"
114
+ end
115
+ end
116
+
117
+ private def escape_value(value)
118
+ case value
119
+ when nil
120
+ "NULL"
121
+ when String
122
+ qv = escape_single_quote(value)
123
+ "'#{qv}'"
124
+ else
125
+ value
126
+ end
127
+ end
128
+
129
+ private def escape_single_quote(value)
130
+ value.gsub("'", "''")
131
+ end
132
+
133
+ private def compile_column_name(ast, column)
134
+ case column
135
+ when Exwiw::QueryAst::ColumnValue::Plain
136
+ "#{ast.from_table_name}.#{column.name}"
137
+ when Exwiw::QueryAst::ColumnValue::RawSql
138
+ column.value
139
+ when Exwiw::QueryAst::ColumnValue::ReplaceWith
140
+ parts = column.value.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
141
+ if part.start_with?('{')
142
+ name = part[1..-2]
143
+ "#{ast.from_table_name}.#{name}"
144
+ else
145
+ "'#{escape_single_quote(part)}'" # escape quotes inside the literal part
146
+ end
147
+ end
148
+
149
+ replaced = parts.join(", ")
150
+ "CONCAT(#{replaced})"
151
+ else
152
+ raise "Unreachable case: #{column.inspect}"
153
+ end
154
+ end
155
+
156
+ private def connection
157
+ @connection ||=
158
+ begin
159
+ require 'mysql2'
160
+ Mysql2::Client.new(
161
+ host: @connection_config.host,
162
+ port: @connection_config.port,
163
+ username: @connection_config.user,
164
+ password: @connection_config.password,
165
+ database: @connection_config.database_name
166
+ )
167
+ end
168
+ end
169
+ end
170
+ end
171
+ end
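To make the output of `to_bulk_insert` concrete, here is a standalone sketch that reuses the same quoting rule as `escape_value` above (nil becomes `NULL`, strings are single-quoted with embedded quotes doubled). It takes a table name and column names directly instead of a table object, and the sample rows are made up.

```ruby
# Same quoting rule as the adapter's escape_value: nil becomes NULL,
# strings are single-quoted with embedded quotes doubled.
def escape_value(value)
  case value
  when nil then "NULL"
  when String then "'#{value.gsub("'", "''")}'"
  else value
  end
end

# Simplified signature for illustration: plain names instead of a table object.
def to_bulk_insert(rows, table_name, column_names)
  value_list = rows.map do |row|
    "(" + row.map { |value| escape_value(value) }.join(", ") + ")"
  end
  "INSERT INTO #{table_name} (#{column_names.join(', ')}) VALUES\n#{value_list.join(",\n")};"
end

# With `cast: false`, mysql2 returns column values as strings (or nil).
rows = [["1", "O'Brien"], ["2", nil]]
puts to_bulk_insert(rows, "users", %w[id name])
```

Like the adapter, this sketch escapes only single quotes. MySQL's default SQL mode also treats a backslash as an escape character inside string literals, so values containing backslashes may need additional handling.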
@@ -0,0 +1,171 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ module Adapter
5
+ class PostgresqlAdapter < Base
6
+ def execute(query_ast)
7
+ sql = compile_ast(query_ast)
8
+
9
+ @logger.debug(" Executing SQL: \n#{sql}")
10
+ connection.exec(sql).values
11
+ end
12
+
13
+ def to_bulk_insert(results, table)
14
+ table_name = table.name
15
+
16
+ value_list = results.map do |row|
17
+ quoted_values = row.map do |value|
18
+ escape_value(value)
19
+ end
20
+ "(" + quoted_values.join(', ') + ")"
21
+ end
22
+ values = value_list.join(",\n")
23
+
24
+ column_names = table.columns.map(&:name).join(', ')
25
+ "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
26
+ end
27
+
28
+ def to_bulk_delete(select_query_ast, table)
29
+ raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
30
+
31
+ sql = "DELETE FROM #{select_query_ast.from_table_name}"
32
+
33
+ if select_query_ast.join_clauses.empty?
34
+ # Ignore filter option, because bulk delete is for cleaning before import,
35
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
36
+ compiled_where_conditions = select_query_ast.
37
+ where_clauses.
38
+ select { |where| where.is_a?(Exwiw::QueryAst::WhereClause) }.
39
+ map do |where|
40
+ compile_where_condition(where, select_query_ast.from_table_name)
41
+ end
42
+
43
+ if compiled_where_conditions.size > 0
44
+ sql += "\nWHERE "
45
+ sql += compiled_where_conditions.join(' AND ')
46
+ end
47
+ sql += ";"
48
+
49
+ return sql
50
+ end
51
+
52
+ subquery_ast = Exwiw::QueryAst::Select.new
53
+ first_join = select_query_ast.join_clauses.first.clone
54
+
55
+ subquery_ast.from(first_join.join_table_name)
56
+ primary_key_col = table.columns.find { |col| col.name == table.primary_key }
57
+ subquery_ast.select([primary_key_col])
58
+ select_query_ast.join_clauses[1..].each do |join|
59
+ subquery_ast.join(join)
60
+ end
61
+ first_join.where_clauses.each do |where|
62
+ # Ignore filter option, because bulk delete is for cleaning before import,
63
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
64
+ subquery_ast.where(where) if where.is_a?(Exwiw::QueryAst::WhereClause)
65
+ end
66
+
67
+ foreign_key = first_join.foreign_key
68
+ subquery_sql = compile_ast(subquery_ast)
69
+ sql += "\nWHERE #{select_query_ast.from_table_name}.#{foreign_key} IN (#{subquery_sql});"
70
+
71
+ sql
72
+ end
73
+
74
+ def compile_ast(query_ast)
75
+ raise NotImplementedError unless query_ast.is_a?(Exwiw::QueryAst::Select)
76
+
77
+ sql = "SELECT "
78
+ sql += query_ast.columns.map { |col| compile_column_name(query_ast, col) }.join(', ')
79
+ sql += " FROM #{query_ast.from_table_name}"
80
+
81
+ query_ast.join_clauses.each do |join|
82
+ sql += " JOIN #{join.join_table_name} ON #{query_ast.from_table_name}.#{join.foreign_key} = #{join.join_table_name}.#{join.primary_key}"
83
+
84
+ join.where_clauses.each do |where|
85
+ compiled_where_condition = compile_where_condition(where, join.join_table_name)
86
+ sql += " AND #{compiled_where_condition}"
87
+ end
88
+ end
89
+
90
+ if query_ast.where_clauses.any?
91
+ sql += " WHERE "
92
+ sql += query_ast.where_clauses.map { |where| compile_where_condition(where, query_ast.from_table_name) }.join(' AND ')
93
+ end
94
+
95
+ sql
96
+ end
97
+
98
+ private def compile_where_condition(where_clause, table_name)
99
+ # Use as it is if it's a raw query
100
+ return where_clause if where_clause.is_a?(String)
101
+
102
+ key = "#{table_name}.#{where_clause.column_name}"
103
+
104
+ if where_clause.operator == :eq
105
+ values = where_clause.value.map { |v| escape_value(v) }
106
+
107
+ if values.size == 1
108
+ "#{key} = #{values.first}"
109
+ else
110
+ "#{key} IN (#{values.join(', ')})"
111
+ end
112
+ else
113
+ raise "Unsupported operator: #{where_clause.operator}"
114
+ end
115
+ end
116
+
117
+ private def escape_value(value)
118
+ case value
119
+ when nil
120
+ "NULL"
121
+ when String
122
+ qv = escape_single_quote(value)
123
+ "'#{qv}'"
124
+ else
125
+ value
126
+ end
127
+ end
128
+
129
+ private def escape_single_quote(value)
130
+ value.gsub("'", "''")
131
+ end
132
+
133
+ private def compile_column_name(ast, column)
134
+ case column
135
+ when Exwiw::QueryAst::ColumnValue::Plain
136
+ "#{ast.from_table_name}.#{column.name}"
137
+ when Exwiw::QueryAst::ColumnValue::RawSql
138
+ column.value
139
+ when Exwiw::QueryAst::ColumnValue::ReplaceWith
140
+ parts = column.value.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
141
+ if part.start_with?('{')
142
+ name = part[1..-2]
143
+ "#{ast.from_table_name}.#{name}"
144
+ else
145
+ "'#{escape_single_quote(part)}'" # escape quotes inside the literal part
146
+ end
147
+ end
148
+
149
+ replaced = parts.join(", ")
150
+ "CONCAT(#{replaced})"
151
+ else
152
+ raise "Unreachable case: #{column.inspect}"
153
+ end
154
+ end
155
+
156
+ private def connection
157
+ @connection ||=
158
+ begin
159
+ require 'pg'
160
+ PG.connect(
161
+ host: @connection_config.host,
162
+ port: @connection_config.port,
163
+ user: @connection_config.user,
164
+ password: @connection_config.password,
165
+ dbname: @connection_config.database_name
166
+ )
167
+ end
168
+ end
169
+ end
170
+ end
171
+ end
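Both adapters compile a where clause the same way: a raw string (such as a `filter`) passes through untouched, a single value becomes `=`, and several values become `IN (...)`. The sketch below restates that behaviour in isolation; the `WhereClause` struct is a hypothetical stand-in for `Exwiw::QueryAst::WhereClause`, whose real definition is not shown in this diff.

```ruby
# Hypothetical stand-in for Exwiw::QueryAst::WhereClause.
WhereClause = Struct.new(:column_name, :operator, :value, keyword_init: true)

def escape_value(value)
  case value
  when nil then "NULL"
  when String then "'#{value.gsub("'", "''")}'"
  else value
  end
end

def compile_where_condition(where_clause, table_name)
  # Raw SQL strings are used as they are.
  return where_clause if where_clause.is_a?(String)

  unless where_clause.operator == :eq
    raise ArgumentError, "Unsupported operator: #{where_clause.operator}"
  end

  key = "#{table_name}.#{where_clause.column_name}"
  values = where_clause.value.map { |v| escape_value(v) }
  values.size == 1 ? "#{key} = #{values.first}" : "#{key} IN (#{values.join(', ')})"
end

one = WhereClause.new(column_name: "id", operator: :eq, value: [1])
many = WhereClause.new(column_name: "id", operator: :eq, value: [1, 2, 3])

puts compile_where_condition(one, "shops")          # shops.id = 1
puts compile_where_condition(many, "shops")         # shops.id IN (1, 2, 3)
puts compile_where_condition("users.id > 0", "users") # users.id > 0
```

Because the raw string branch does no escaping or qualification, any `filter` value ends up in the generated SQL verbatim, which is why the README asks for table-qualified column names.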