exwiw 0.1.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: 0aa1977437fc4e44349ecc11431e4f8697acb1471e159777c629850a52de1664
4
+ data.tar.gz: cd92c7a05f6958d2f0cc5ee1204b30fe7b6bd6e168aa18c0b2b182da3b00af9a
5
+ SHA512:
6
+ metadata.gz: e979a144ac442f73483c23c93cb335742b43e9d198811788edb9a416a3d70cd3b23b64ab0783cf20b4452afa00d46827aaed629d021d880172d373491cc9ec96
7
+ data.tar.gz: 35375bb916081981ff264dd74a53b59638c78ad0e2f4dfb40d8716bec59caa387dc114001234be8bb2ba20a3345ecac9d88a1b7b0322ae38b8d92759c61efa75
data/CHANGELOG.md ADDED
@@ -0,0 +1,5 @@
1
+ ## [Unreleased]
2
+
3
+ ## [0.1.0] - 2025-01-31
4
+
5
+ - Initial release
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2025 Shia
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,183 @@
1
+ # Exwiw
2
+
3
+ Export What I Want (Exwiw) is a Ruby gem that exports records from a database to dump files (specifically, a full list of INSERT statements) based on the specified conditions.
4
+
5
+ ## When to use
6
+
7
+ In most cases when developing software, there is no better data to work with than the data in production.
8
+ You might craft data by hand, but it is very hard to maintain.
9
+
10
+ If you are looking for a way to maintain data for the development environment, exwiw might be a solution.
11
+
12
+ - Export the full database and mask data and import to another database.
13
+ - Set up a system to replicate and mask data in real time to another database.
14
+
15
+
16
+ You want to export only the data you want to export.
17
+
18
+ ## Features
19
+
20
+ - Export the full list of INSERT sql for the specified conditions.
21
+ - Provide several masking options for sensitive columns.
22
+ - Provide config generator for ActiveRecord.
23
+
24
+ ## Installation
25
+
26
+ ```bash
27
+ bundle add exwiw
28
+ ```
29
+
30
+ In most cases, you will want to add `require: false` to the Gemfile entry.
31
+
32
+ If bundler is not being used to manage dependencies, install the gem by executing:
33
+
34
+ ```bash
35
+ gem install exwiw
36
+ ```
37
+
38
+ ## Supported Databases
39
+
40
+ - mysql2
41
+ - postgresql
42
+ - sqlite3
43
+
44
+ ## Usage
45
+
46
+ ### Command
47
+
48
+ ```bash
49
+ # dump and mask records from the database into --output-dir, based on the JSON configs in --config-dir
50
+ # pass the database password via the 'DATABASE_PASSWORD' environment variable; --ids takes comma separated ids
51
+ exwiw \
52
+ --adapter=mysql2 \
53
+ --host=localhost \
54
+ --port=3306 \
55
+ --user=reader \
56
+ --database=app_production \
57
+ --config-dir=exwiw \
58
+ --target-table=shops \
59
+ --ids=1 \
60
+ --output-dir=dump \
61
+ --log-level=info
62
+ ```
63
+
64
+ This command will generate SQL files in the `dump` directory.
65
+
66
+ - `dump/insert-{idx}-{table_name}.sql`
67
+ - `dump/delete-{idx}-{table_name}.sql`
68
+
69
+ idx is the order of the dump. A file with a larger idx might depend on one with a smaller idx,
70
+ so you should import the dump in order.
71
+
72
+ You need to delete the existing records before importing the dump;
73
+ `delete-{idx}-{table_name}.sql` will help you to do that.
74
+ This SQL deletes all records related to the extraction targets.
75
+ The meaning of idx is the same as for the insert SQL.
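
The ordering rule above could be applied with a short script such as the one below. This is a minimal sketch and not part of exwiw: the helper name `ordered_dump_files` is hypothetical, and it relies only on the `{kind}-{idx}-{table_name}.sql` naming described above. It sorts numerically so that it works whether or not idx is zero-padded.

```ruby
# Hypothetical helper (not part of exwiw): order dump files by their idx.
# Assumes the naming scheme "<kind>-<idx>-<table_name>.sql" described above.
def ordered_dump_files(paths, kind)
  pattern = /\A#{Regexp.escape(kind)}-(\d+)-.+\.sql\z/

  paths
    .select { |path| File.basename(path).match?(pattern) }
    .sort_by { |path| File.basename(path)[pattern, 1].to_i }
end

# List the insert files of the "dump" directory in import order.
files = Dir.glob(File.join("dump", "*.sql"))
ordered_dump_files(files, "insert").each { |path| puts path }
```

Sorting on the parsed integer rather than on the file name avoids `insert-10-...` being placed before `insert-2-...`.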
76
+
77
+ ### Generator
78
+
79
+ The config generator is provided as a Rake task.
80
+
81
+ ```bash
82
+ # generate table schema under exwiw/
83
+ bundle exec rake exwiw:schema:generate
84
+ ```
85
+
86
+ ### Configuration
87
+
88
+ This is an example of the schema for one table:
89
+
90
+ ```json
91
+ {
92
+ "name": "users",
93
+ "primary_key": "id",
94
+ "filter": "users.id > 0",
95
+ "belongs_to": [{
96
+ "name": "companies",
97
+ "foreign_key": "company_id"
98
+ }],
99
+ "columns": [{
100
+ "name": "id"
101
+ }, {
102
+ "name": "email",
103
+ "replace_with": "user{id}@example.com"
104
+ }, {
105
+ "name": "company_id"
106
+ }]
107
+ }
108
+ ```
109
+
110
+ Exwiw uses all JSON files in the directory given by `--config-dir`.
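
To make the shape of these files concrete, the sketch below reads every JSON file in a config directory and lists the masked columns per table. It is an illustration only, not exwiw's own loader: `load_table_configs` is a hypothetical name, and the field names are the ones used in the example schema above. Ruby's standard `JSON.parse` rejects trailing commas by default, so the files need to be strict JSON for this sketch to work.

```ruby
require "json"

# Hypothetical loader (not exwiw's implementation): parse every *.json file
# in a directory into a Hash, keyed by the table's "name" field.
def load_table_configs(config_dir)
  Dir.glob(File.join(config_dir, "*.json")).sort.to_h do |path|
    config = JSON.parse(File.read(path))
    [config.fetch("name"), config]
  end
end

# Print the columns that carry a masking option, table by table.
configs = load_table_configs("exwiw")
configs.each do |name, config|
  masked = config.fetch("columns", []).select do |column|
    column.key?("replace_with") || column.key?("raw_sql")
  end
  puts "#{name}: #{masked.map { |column| column['name'] }.join(', ')}"
end
```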
111
+
112
+ ### Filter
113
+
114
+ In some cases, you don't need all the records related to the target, e.g. you want to dump user access logs only for the last year.
115
+ `filter` exists for that. Be careful when using this option, because it is:
116
+
117
+ - injected as-is into the table condition (e.g. WHERE on MySQL), so you should qualify column names with the table name to avoid ambiguity.
118
+ - injected into every WHERE / JOIN clause, so it affects all tables that depend on the filtered table, which can result in data inconsistency.
119
+
120
+ ### Masking
121
+
122
+ `exwiw` provides several options for masking values.
123
+
124
+ #### `replace_with`
125
+
126
+ It will replace the value with the specified string,
127
+ and you can use the column name with `{}` to replace the value with the column value.
128
+
129
+ For example, assuming we have a record whose id is 1,
130
+ then "user{id}@example.com" will be replaced with "user1@example.com".
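
The adapters shipped in this version compile a `replace_with` template into a SQL `CONCAT(...)` expression instead of substituting values in Ruby. The standalone sketch below mirrors that logic for illustration; the function name `replace_with_to_sql` is hypothetical and exists only in this example.

```ruby
# Standalone sketch mirroring how the bundled adapters compile `replace_with`.
# `replace_with_to_sql` is a hypothetical name used only for this example.
def replace_with_to_sql(template, table_name)
  parts = template.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
    if part.start_with?("{")
      # "{id}" becomes a table-qualified column reference.
      "#{table_name}.#{part[1..-2]}"
    else
      # Everything else becomes a SQL string literal.
      "'#{part}'"
    end
  end

  "CONCAT(#{parts.join(', ')})"
end

puts replace_with_to_sql("user{id}@example.com", "users")
# prints: CONCAT('user', users.id, '@example.com')
```

Literal parts are wrapped in quotes without escaping, so in this sketch a template containing a single quote would produce invalid SQL.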
131
+
132
+ #### `raw_sql`
133
+
134
+ The given SQL expression will be used instead of the original value.
135
+
136
+ For example, `"raw_sql": "CONCAT('user', shops.id, '@example.com')"` is equivalent to
137
+ `"replace_with": "user{id}@example.com"`.
138
+ This is useful when you want to transform with functions provided by the database.
139
+
140
+ Note that you should qualify column names with the table name to avoid ambiguity.
141
+
142
+ If it is used together with `replace_with`, `replace_with` will be ignored.
143
+
144
+ #### `map`
145
+
146
+ XXX: TODO
147
+
148
+ The given value will be evaluated as Ruby code and treated as a proc.
149
+
150
+ ```
151
+ "map": "proc { |r| 'user' + r['id'].to_s + '@example.com' }"
152
+ ```
153
+
154
+ which is equivalent to `"replace_with": "user{id}@example.com"`.
155
+
156
+ Note that this is the most powerful option, but you should use it with care.
157
+ The transformation runs in the exwiw process, so it is much slower than the other options.
158
+ In most cases, this option is not recommended.
159
+
160
+ ## How it works
161
+
162
+ - Load the table information from the specified config file.
163
+ - Calculate the dependency between tables.
164
+ - Generate the full list of INSERT sql based on the specified conditions.
165
+ - If the processing table has no relation with target tables, then dump all records.
166
+ - If the processing table has relation with target tables, then dump the records which are related to the target tables.
167
+ - Generate the full list of DELETE sql based on the specified conditions.
168
+ - If the processing table has no relation with target tables, then delete all records.
169
+ - If the processing table has relation with target tables, then delete the records which are related to the target tables.
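
The "calculate the dependency between tables" step can be pictured as a topological sort over the `belongs_to` relations. The sketch below is an assumption about one way to do this with Ruby's standard `tsort` library; it is not taken from exwiw's source, and the table names are hypothetical.

```ruby
require "tsort"

# Hypothetical input: table name => names of the tables it belongs_to.
belongs_to = {
  "orders"    => ["users"],
  "users"     => ["companies"],
  "companies" => []
}

each_node  = ->(&block) { belongs_to.each_key(&block) }
each_child = ->(table, &block) { belongs_to.fetch(table).each(&block) }

# TSort emits a node only after all of its children, so referenced
# (parent) tables come before the tables that reference them.
insert_order = TSort.tsort(each_node, each_child)
puts insert_order.join(" -> ")
```

An order like this is what makes the numbered dump files importable in sequence: a table is inserted only after the tables it references.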
170
+
171
+ ## Development
172
+
173
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
174
+
175
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
176
+
177
+ ## Contributing
178
+
179
+ Bug reports and pull requests are welcome on GitHub at https://github.com/riseshia/exwiw.
180
+
181
+ ## License
182
+
183
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/exe/exwiw ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require 'exwiw/cli'
5
+
6
+ Exwiw::CLI.start(ARGV)
@@ -0,0 +1,171 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ module Adapter
5
+ class Mysql2Adapter < Base
6
+ def execute(query_ast)
7
+ sql = compile_ast(query_ast)
8
+
9
+ @logger.debug(" Executing SQL: \n#{sql}")
10
+ connection.query(sql, cast: false, as: :array).to_a
11
+ end
12
+
13
+ def to_bulk_insert(results, table)
14
+ table_name = table.name
15
+
16
+ value_list = results.map do |row|
17
+ quoted_values = row.map do |value|
18
+ escape_value(value)
19
+ end
20
+ "(" + quoted_values.join(', ') + ")"
21
+ end
22
+ values = value_list.join(",\n")
23
+
24
+ column_names = table.columns.map(&:name).join(', ')
25
+ "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
26
+ end
27
+
28
+ def to_bulk_delete(select_query_ast, table)
29
+ raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
30
+
31
+ sql = "DELETE FROM #{select_query_ast.from_table_name}"
32
+
33
+ if select_query_ast.join_clauses.empty?
34
+ # Ignore filter option, because bulk delete is for cleaning before import,
35
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
36
+ compiled_where_conditions = select_query_ast.
37
+ where_clauses.
38
+ select { |where| where.is_a?(Exwiw::QueryAst::WhereClause) }.
39
+ map do |where|
40
+ compile_where_condition(where, select_query_ast.from_table_name)
41
+ end
42
+
43
+ if compiled_where_conditions.size > 0
44
+ sql += "\nWHERE "
45
+ sql += compiled_where_conditions.join(' AND ')
46
+ end
47
+ sql += ";"
48
+
49
+ return sql
50
+ end
51
+
52
+ subquery_ast = Exwiw::QueryAst::Select.new
53
+ first_join = select_query_ast.join_clauses.first.clone
54
+
55
+ subquery_ast.from(first_join.join_table_name)
56
+ primay_key_col = table.columns.find { |col| col.name == table.primary_key }
57
+ subquery_ast.select([primay_key_col])
58
+ select_query_ast.join_clauses[1..].each do |join|
59
+ subquery_ast.join(join)
60
+ end
61
+ first_join.where_clauses.each do |where|
62
+ # Ignore filter option, because bulk delete is for cleaning before import,
63
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
64
+ subquery_ast.where(where) if where.is_a?(Exwiw::QueryAst::WhereClause)
65
+ end
66
+
67
+ foreign_key = first_join.foreign_key
68
+ subquery_sql = compile_ast(subquery_ast)
69
+ sql += "\nWHERE #{select_query_ast.from_table_name}.#{foreign_key} IN (#{subquery_sql});"
70
+
71
+ sql
72
+ end
73
+
74
+ def compile_ast(query_ast)
75
+ raise NotImplementedError unless query_ast.is_a?(Exwiw::QueryAst::Select)
76
+
77
+ sql = "SELECT "
78
+ sql += query_ast.columns.map { |col| compile_column_name(query_ast, col) }.join(', ')
79
+ sql += " FROM #{query_ast.from_table_name}"
80
+
81
+ query_ast.join_clauses.each do |join|
82
+ sql += " JOIN #{join.join_table_name} ON #{query_ast.from_table_name}.#{join.foreign_key} = #{join.join_table_name}.#{join.primary_key}"
83
+
84
+ join.where_clauses.each do |where|
85
+ compiled_where_condition = compile_where_condition(where, join.join_table_name)
86
+ sql += " AND #{compiled_where_condition}"
87
+ end
88
+ end
89
+
90
+ if query_ast.where_clauses.any?
91
+ sql += " WHERE "
92
+ sql += query_ast.where_clauses.map { |where| compile_where_condition(where, query_ast.from_table_name) }.join(' AND ')
93
+ end
94
+
95
+ sql
96
+ end
97
+
98
+ private def compile_where_condition(where_clause, table_name)
99
+ # Use as it is if it's a raw query
100
+ return where_clause if where_clause.is_a?(String)
101
+
102
+ key = "#{table_name}.#{where_clause.column_name}"
103
+
104
+ if where_clause.operator == :eq
105
+ values = where_clause.value.map { |v| escape_value(v) }
106
+
107
+ if values.size == 1
108
+ "#{key} = #{values.first}"
109
+ else
110
+ "#{key} IN (#{values.join(', ')})"
111
+ end
112
+ else
113
+ raise "Unsupported operator: #{where_clause.operator}"
114
+ end
115
+ end
116
+
117
+ private def escape_value(value)
118
+ case value
119
+ when nil
120
+ "NULL"
121
+ when String
122
+ qv = escape_single_quote(value)
123
+ "'#{qv}'"
124
+ else
125
+ value
126
+ end
127
+ end
128
+
129
+ private def escape_single_quote(value)
130
+ value.gsub("'", "''")
131
+ end
132
+
133
+ private def compile_column_name(ast, column)
134
+ case column
135
+ when Exwiw::QueryAst::ColumnValue::Plain
136
+ "#{ast.from_table_name}.#{column.name}"
137
+ when Exwiw::QueryAst::ColumnValue::RawSql
138
+ column.value
139
+ when Exwiw::QueryAst::ColumnValue::ReplaceWith
140
+ parts = column.value.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
141
+ if part.start_with?('{')
142
+ name = part[1..-2]
143
+ "#{ast.from_table_name}.#{name}"
144
+ else
145
+ "'#{part}'"
146
+ end
147
+ end
148
+
149
+ replaced = parts.join(", ")
150
+ "CONCAT(#{replaced})"
151
+ else
152
+ raise "Unreachable case: #{column.inspect}"
153
+ end
154
+ end
155
+
156
+ private def connection
157
+ @connection ||=
158
+ begin
159
+ require 'mysql2'
160
+ Mysql2::Client.new(
161
+ host: @connection_config.host,
162
+ port: @connection_config.port,
163
+ username: @connection_config.user,
164
+ password: @connection_config.password,
165
+ database: @connection_config.database_name
166
+ )
167
+ end
168
+ end
169
+ end
170
+ end
171
+ end
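
The value escaping used by this adapter can be tried in isolation. The sketch below copies the logic of `escape_value` and `escape_single_quote` into a hypothetical `EscapeSketch` module so the behavior is easy to check; it is an illustration, not the gem's public API. Only single quotes are doubled, and since MySQL also treats backslash as an escape character in string literals unless `NO_BACKSLASH_ESCAPES` is enabled, values containing backslashes may need extra care.

```ruby
# Hypothetical module that copies the adapter's private escaping helpers
# so they can be tried outside the gem.
module EscapeSketch
  module_function

  def escape_value(value)
    case value
    when nil    then "NULL"
    when String then "'#{escape_single_quote(value)}'"
    else value
    end
  end

  def escape_single_quote(value)
    value.gsub("'", "''")
  end
end

# Build one VALUES tuple the same way to_bulk_insert does for each row.
row = [1, "O'Reilly", nil]
puts "(" + row.map { |v| EscapeSketch.escape_value(v) }.join(", ") + ")"
# prints: (1, 'O''Reilly', NULL)
```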
@@ -0,0 +1,171 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ module Adapter
5
+ class PostgresqlAdapter < Base
6
+ def execute(query_ast)
7
+ sql = compile_ast(query_ast)
8
+
9
+ @logger.debug(" Executing SQL: \n#{sql}")
10
+ connection.exec(sql).values
11
+ end
12
+
13
+ def to_bulk_insert(results, table)
14
+ table_name = table.name
15
+
16
+ value_list = results.map do |row|
17
+ quoted_values = row.map do |value|
18
+ escape_value(value)
19
+ end
20
+ "(" + quoted_values.join(', ') + ")"
21
+ end
22
+ values = value_list.join(",\n")
23
+
24
+ column_names = table.columns.map(&:name).join(', ')
25
+ "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
26
+ end
27
+
28
+ def to_bulk_delete(select_query_ast, table)
29
+ raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
30
+
31
+ sql = "DELETE FROM #{select_query_ast.from_table_name}"
32
+
33
+ if select_query_ast.join_clauses.empty?
34
+ # Ignore filter option, because bulk delete is for cleaning before import,
35
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
36
+ compiled_where_conditions = select_query_ast.
37
+ where_clauses.
38
+ select { |where| where.is_a?(Exwiw::QueryAst::WhereClause) }.
39
+ map do |where|
40
+ compile_where_condition(where, select_query_ast.from_table_name)
41
+ end
42
+
43
+ if compiled_where_conditions.size > 0
44
+ sql += "\nWHERE "
45
+ sql += compiled_where_conditions.join(' AND ')
46
+ end
47
+ sql += ";"
48
+
49
+ return sql
50
+ end
51
+
52
+ subquery_ast = Exwiw::QueryAst::Select.new
53
+ first_join = select_query_ast.join_clauses.first.clone
54
+
55
+ subquery_ast.from(first_join.join_table_name)
56
+ primay_key_col = table.columns.find { |col| col.name == table.primary_key }
57
+ subquery_ast.select([primay_key_col])
58
+ select_query_ast.join_clauses[1..].each do |join|
59
+ subquery_ast.join(join)
60
+ end
61
+ first_join.where_clauses.each do |where|
62
+ # Ignore filter option, because bulk delete is for cleaning before import,
63
+ # so it should delete all records to avoid foreign key violations and data inconsistency.
64
+ subquery_ast.where(where) if where.is_a?(Exwiw::QueryAst::WhereClause)
65
+ end
66
+
67
+ foreign_key = first_join.foreign_key
68
+ subquery_sql = compile_ast(subquery_ast)
69
+ sql += "\nWHERE #{select_query_ast.from_table_name}.#{foreign_key} IN (#{subquery_sql});"
70
+
71
+ sql
72
+ end
73
+
74
+ def compile_ast(query_ast)
75
+ raise NotImplementedError unless query_ast.is_a?(Exwiw::QueryAst::Select)
76
+
77
+ sql = "SELECT "
78
+ sql += query_ast.columns.map { |col| compile_column_name(query_ast, col) }.join(', ')
79
+ sql += " FROM #{query_ast.from_table_name}"
80
+
81
+ query_ast.join_clauses.each do |join|
82
+ sql += " JOIN #{join.join_table_name} ON #{query_ast.from_table_name}.#{join.foreign_key} = #{join.join_table_name}.#{join.primary_key}"
83
+
84
+ join.where_clauses.each do |where|
85
+ compiled_where_condition = compile_where_condition(where, join.join_table_name)
86
+ sql += " AND #{compiled_where_condition}"
87
+ end
88
+ end
89
+
90
+ if query_ast.where_clauses.any?
91
+ sql += " WHERE "
92
+ sql += query_ast.where_clauses.map { |where| compile_where_condition(where, query_ast.from_table_name) }.join(' AND ')
93
+ end
94
+
95
+ sql
96
+ end
97
+
98
+ private def compile_where_condition(where_clause, table_name)
99
+ # Use as it is if it's a raw query
100
+ return where_clause if where_clause.is_a?(String)
101
+
102
+ key = "#{table_name}.#{where_clause.column_name}"
103
+
104
+ if where_clause.operator == :eq
105
+ values = where_clause.value.map { |v| escape_value(v) }
106
+
107
+ if values.size == 1
108
+ "#{key} = #{values.first}"
109
+ else
110
+ "#{key} IN (#{values.join(', ')})"
111
+ end
112
+ else
113
+ raise "Unsupported operator: #{where_clause.operator}"
114
+ end
115
+ end
116
+
117
+ private def escape_value(value)
118
+ case value
119
+ when nil
120
+ "NULL"
121
+ when String
122
+ qv = escape_single_quote(value)
123
+ "'#{qv}'"
124
+ else
125
+ value
126
+ end
127
+ end
128
+
129
+ private def escape_single_quote(value)
130
+ value.gsub("'", "''")
131
+ end
132
+
133
+ private def compile_column_name(ast, column)
134
+ case column
135
+ when Exwiw::QueryAst::ColumnValue::Plain
136
+ "#{ast.from_table_name}.#{column.name}"
137
+ when Exwiw::QueryAst::ColumnValue::RawSql
138
+ column.value
139
+ when Exwiw::QueryAst::ColumnValue::ReplaceWith
140
+ parts = column.value.scan(/[^{}]+|\{[^{}]*\}/).map do |part|
141
+ if part.start_with?('{')
142
+ name = part[1..-2]
143
+ "#{ast.from_table_name}.#{name}"
144
+ else
145
+ "'#{part}'"
146
+ end
147
+ end
148
+
149
+ replaced = parts.join(", ")
150
+ "CONCAT(#{replaced})"
151
+ else
152
+ raise "Unreachable case: #{column.inspect}"
153
+ end
154
+ end
155
+
156
+ private def connection
157
+ @connection ||=
158
+ begin
159
+ require 'pg'
160
+ PG.connect(
161
+ host: @connection_config.host,
162
+ port: @connection_config.port,
163
+ user: @connection_config.user,
164
+ password: @connection_config.password,
165
+ dbname: @connection_config.database_name
166
+ )
167
+ end
168
+ end
169
+ end
170
+ end
171
+ end