exwiw 0.1.9 → 0.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +14 -0
- data/README.md +77 -16
- data/docs/plans/2026-05-22-after-insert-hook.md +189 -0
- data/docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md +91 -0
- data/lib/exwiw/adapter/mongodb_adapter.rb +4 -0
- data/lib/exwiw/adapter/mysql2_adapter.rb +11 -0
- data/lib/exwiw/adapter/postgresql_adapter.rb +7 -0
- data/lib/exwiw/adapter/sqlite3_adapter.rb +8 -0
- data/lib/exwiw/adapter.rb +7 -0
- data/lib/exwiw/after_insert_hook.rb +63 -0
- data/lib/exwiw/cli.rb +129 -29
- data/lib/exwiw/explain_runner.rb +83 -0
- data/lib/exwiw/mongodb_collection_config.rb +1 -0
- data/lib/exwiw/runner.rb +48 -2
- data/lib/exwiw/table_config.rb +2 -0
- data/lib/exwiw/version.rb +1 -1
- data/lib/exwiw.rb +2 -0
- metadata +5 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 4855b3fc49afc6cc69606579eb15c0cf430c4123024ec8e9b26a5215989292d3
+  data.tar.gz: 1ee99545e00cb43292c59b918dc24d3b4f764427f8708fca773c0b507f6c1e2d
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: fcbe518ae634e294bd48e61088dd4bd0f1012b317b08d921bcbd7ee95e4e595c69c9df59ad18593dbcee35b017c1fa6a198102f27757183b804218920ff64cbb
+  data.tar.gz: 48475293bc58a6f32ec9edc68c3bdc0bdd1427233ec4cd2366fb656e5fdd1e050a260056fc69e73c5735eeb8cb6b457ef7244a13589946430f138359cc050911
data/CHANGELOG.md
CHANGED
@@ -2,6 +2,20 @@
 
 ## [Unreleased]
 
+## [0.2.1] - 2026-05-23
+
+### Added
+
+- `skip: true` table config attribute to explicitly exclude a table from the dump. Skipped tables produce no schema entry, no `insert-*` file, and no `delete-*` file. Using a skipped table as `--target-table`, or having another non-skipped table reference it via `belongs_to`, raises `ArgumentError` on load. Available for both SQL adapters (`TableConfig`) and the MongoDB adapter (`MongodbCollectionConfig`). ([#26](https://github.com/heyinc/exwiw/pull/26))
+- `dump` / `explain` subcommands. `dump` is the default and preserves the existing behavior when no subcommand is given. `explain` prints the compiled SQL and its `EXPLAIN` output (estimate-only — `EXPLAIN QUERY PLAN` on SQLite) for each extraction query to stdout without executing the SELECTs. Supported for `mysql2`, `postgresql`, and `sqlite3`; `mongodb` is not yet supported. ([#28](https://github.com/heyinc/exwiw/pull/28))
+
+## [0.2.0] - 2026-05-22
+
+### Added
+
+- `--insert-only` CLI flag to skip generating `delete-*.sql` files. ([#23](https://github.com/heyinc/exwiw/pull/23))
+- `--after-insert-hook` CLI flag to run a hook after per-table insert/delete files are generated. `.rb` hooks evaluate `insert_sql` DSL via ERB and write the result to `insert-{N+1}-after_insert.{ext}`; other executables run as a child process with `EXWIW_*` environment variables for pure side-effect hooks. ([#24](https://github.com/heyinc/exwiw/pull/24))
+
 ## [0.1.9] - 2026-05-21
 
 ### Added
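The `skip: true` rules in the 0.2.1 changelog entry above can be illustrated with a small sketch. This is not exwiw's implementation: `TableSketch` and `validate_skips!` are hypothetical names, and the only behavior taken from the changelog is that a skipped target table, or a `belongs_to` reference from a non-skipped table to a skipped one, raises `ArgumentError`.

```ruby
# Hypothetical, simplified stand-in for a table config (not exwiw's TableConfig).
TableSketch = Struct.new(:name, :skip, :belongs_to_names, keyword_init: true)

# Raises ArgumentError when the target table is skipped, or when a
# non-skipped table references a skipped one. Returns the tables to dump.
def validate_skips!(tables, target_table_name)
  skipped = tables.select(&:skip).map(&:name)

  if skipped.include?(target_table_name)
    raise ArgumentError, "target table '#{target_table_name}' is marked skip: true"
  end

  tables.reject(&:skip).each do |table|
    bad = table.belongs_to_names & skipped
    next if bad.empty?

    raise ArgumentError, "'#{table.name}' belongs_to skipped table(s): #{bad.join(', ')}"
  end

  # Only non-skipped tables take part in the dump.
  tables.reject(&:skip)
end
```

Running the check once at load time, before any query is compiled, matches the changelog's "raises `ArgumentError` on load" wording.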
data/README.md
CHANGED
@@ -44,7 +44,12 @@ gem install exwiw
 
 ## Usage
 
-
+exwiw has two subcommands:
+
+- `dump` (default) — generate INSERT/COPY SQL files. This is the existing behavior; if the subcommand is omitted, `dump` is assumed for backwards compatibility.
+- `explain` — print the compiled SQL and its `EXPLAIN` output for each query that `dump` would run, without executing the SELECTs.
+
+### `exwiw dump`
 
 ```bash
 # dump & masking all records from database to dump.sql based on schema.json
@@ -92,6 +97,22 @@ you need to delete the records before importing the dump,
 This sql will delete "all" related records to the extract targets.
 idx meaning is the same as insert sql.
 
+### `exwiw explain`
+
+Print the compiled SQL and its `EXPLAIN` output (estimate-only; `EXPLAIN QUERY PLAN` on SQLite) for each query that `dump` would run, to stdout. No SELECT is executed. Supported for `mysql2`, `postgresql`, and `sqlite3`. The `mongodb` adapter is not yet supported.
+
+```bash
+# preview the queries exwiw would run, without executing the SELECTs
+exwiw explain \
+  --adapter=postgresql \
+  --host=localhost --port=5432 --user=reader \
+  --database=app_production \
+  --config-dir=exwiw \
+  --target-table=shops --ids=1
+```
+
+The `--output-dir`, `--output-format`, `--insert-only`, and `--after-insert-hook` options are dump-specific and rejected when used with `explain`.
+
 ### Generator
 
 The config generator is provided as a Rake task.
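The README hunk above says the dump-specific options are rejected under `explain`. A minimal sketch of such a check follows, assuming flags arrive as raw `--flag=value` strings. `DUMP_ONLY_FLAGS` and `rejected_flags` are illustrative names, not exwiw's API; the real check lives in `validate_explain_only!`, whose body is not shown in this diff.

```ruby
# Flags the README lists as dump-specific.
DUMP_ONLY_FLAGS = %w[--output-dir --output-format --insert-only --after-insert-hook].freeze

# Returns the dump-only flags present in argv when the subcommand is "explain".
# An empty result means nothing has to be rejected.
def rejected_flags(subcommand, argv)
  return [] unless subcommand == "explain"

  # Strip any "=value" suffix so "--output-dir=dump" matches "--output-dir".
  given = argv.map { |arg| arg.split("=", 2).first }
  DUMP_ONLY_FLAGS & given
end
```

A caller could print the returned flags to stderr and exit non-zero, in the same style as the existing option validation.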
@@ -136,21 +157,7 @@ This is an example of the one table schema:
 
 ### Output format
 
-By default, exwiw generates `INSERT` statements. For PostgreSQL, you can
-
-```bash
-exwiw \
-  --adapter=postgresql \
-  --host=localhost \
-  --port=5432 \
-  --user=reader \
-  --database=app_production \
-  --config-dir=exwiw \
-  --target-table=shops \
-  --ids=1 \
-  --output-dir=dump \
-  --output-format=copy
-```
+By default, exwiw generates `INSERT` statements. For PostgreSQL, you can pass `--output-format=copy` to generate `COPY FROM stdin` format instead, which is significantly faster for bulk loading.
 
 The generated file uses tab-separated values with PostgreSQL's text-format escaping (`\N` for NULL, `\\` for backslash, etc.). Import with `psql`:
 
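The text-format escaping named in the hunk above (`\N` for NULL, `\\` for backslash) can be sketched as follows. This is an illustration of PostgreSQL's COPY text format under simple assumptions, not exwiw's `to_copy_from_stdin`; `copy_escape` and `copy_row` are hypothetical helper names.

```ruby
# Escape one value for PostgreSQL's COPY text format.
def copy_escape(value)
  return '\N' if value.nil? # NULL marker: backslash followed by N

  value.to_s.gsub(/[\\\t\n\r]/) do |ch|
    case ch
    when "\\" then '\\\\' # a literal backslash becomes two backslashes
    when "\t" then '\t'   # a tab would otherwise be read as a column separator
    when "\n" then '\n'   # a newline would otherwise end the row
    when "\r" then '\r'
    end
  end
end

# Join the escaped values of one row with tabs.
def copy_row(values)
  values.map { |v| copy_escape(v) }.join("\t")
end
```

Escaping the separator and row terminator inside values is what lets `psql` split each line into columns unambiguously.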
@@ -160,6 +167,60 @@ psql -d app_dev -f dump/insert-001-shops.sql
 
 `--output-format=copy` is only supported with the `postgresql` adapter.
 
+### Skip DELETE SQL output
+
+By default, exwiw generates `delete-*.sql` files alongside the `insert-*.sql` files so that an existing dataset can be cleared before re-inserting. Pass `--insert-only` when you only need the insert files.
+
+### After-insert hook
+
+`--after-insert-hook=PATH` runs a post-processing hook **after** all per-table insert/delete files have been written. The hook can be either a Ruby file (`.rb`) or any executable script (e.g. `.sh`).
+
+**Ruby hook (`.rb`)**: provides a tiny DSL with two builtins:
+
+- `cli_options` — Hash of all parsed CLI options (e.g. `cli_options.fetch(:ids)` returns the `--ids` array).
+- `insert_sql(template)` — appends an ERB-rendered string to a buffer. After the hook finishes, the buffer is concatenated and written to `insert-{N+1}-after_insert.{ext}` where `{N+1}` is one past the last per-table insert file. For the MongoDB adapter the equivalent alias `insert_jsonl(template)` is available; output goes to `insert-{N+1}-after_insert.jsonl`. Multiple `insert_sql` calls in a single hook are joined with `"\n"` into the same file. If no `insert_sql` call is made, no file is created.
+
+Example `hooks/seed_default_users.rb`:
+
+```ruby
+insert_sql <<~SQL
+  -- seed default users for tenants <%= cli_options.fetch(:ids).join(',') %>
+  <%- cli_options.fetch(:ids).each do |tenant_id| -%>
+  INSERT INTO users (tenant_id, email) VALUES (<%= tenant_id %>, 'default@example.com');
+  <%- end -%>
+SQL
+```
+
+**Shell hook**: anything other than `.rb` is exec'd as a child process. It is a pure side-effect hook — exwiw does not capture its stdout. The hook receives these env vars and inherits `DATABASE_PASSWORD` from the parent:
+
+- `EXWIW_OUTPUT_DIR`, `EXWIW_CONFIG_DIR`
+- `EXWIW_DATABASE_ADAPTER`, `EXWIW_DATABASE_HOST`, `EXWIW_DATABASE_PORT`, `EXWIW_DATABASE_USER`, `EXWIW_DATABASE_NAME`
+- `EXWIW_TARGET_TABLE`, `EXWIW_IDS` (comma-separated), `EXWIW_OUTPUT_FORMAT`
+
+A non-zero exit code from the shell hook aborts exwiw.
+
+Note: Ruby hooks are evaluated via `instance_eval` inside the exwiw process — only pass paths you trust.
+
+### Skip a table
+
+Set `"skip": true` on a table's config JSON to explicitly exclude it from the dump. The table is omitted from `insert-000-schema.{sql,js}`, and no `insert-*` / `delete-*` files are generated for it. Skipped tables are also not queried at all.
+
+```json
+{
+  "name": "audit_logs",
+  "primary_key": "id",
+  "skip": true,
+  "belongs_tos": [],
+  "columns": [{ "name": "id" }]
+}
+```
+
+Constraints:
+
+- If another non-skipped table has a `belongs_to` entry pointing at a skipped table, exwiw raises `ArgumentError` on load. Remove the `belongs_to` entry on the referencing table, or unset `skip` on the referenced table.
+- Specifying a skipped table as `--target-table` raises `ArgumentError`.
+- `skip: true` is preserved by `exwiw:schema:generate` regenerations (the receiver value wins over the auto-generated config).
+
 ### Bulk insert chunk size
 
 `bulk_insert_chunk_size` splits the generated `INSERT` statement into multiple statements, each containing at most the specified number of rows. This is useful when the number of records per table is large enough to hit limits like MySQL's `max_allowed_packet`.
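The chunking described in the "Bulk insert chunk size" paragraph that closes the hunk above can be sketched with `Enumerable#each_slice`. This is an illustration only: `chunked_inserts` is a hypothetical helper, and it assumes each row is already an array of SQL-literal strings, which is not necessarily how exwiw builds its statements.

```ruby
# Build one INSERT statement per slice of at most chunk_size rows.
# Each row is assumed to be an array of already-quoted SQL literals.
def chunked_inserts(table, columns, rows, chunk_size)
  rows.each_slice(chunk_size).map do |slice|
    values = slice.map { |row| "(#{row.join(', ')})" }.join(",\n")
    "INSERT INTO #{table} (#{columns.join(', ')}) VALUES\n#{values};"
  end
end
```

With five rows and a chunk size of two this yields three statements, the last one carrying a single row, so no statement grows beyond the configured row count.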
data/docs/plans/2026-05-22-after-insert-hook.md
ADDED
@@ -0,0 +1,189 @@
+# Plan: `--after-insert-hook` hook (Ruby DSL / shell script)
+
+## Context
+
+Currently `exwiw` generates `insert-000-schema.{sql,js}` → `insert-NNN-{table}.{sql,jsonl}` → `delete-NNN-...` and then stops. In real-world use there are cases where you want to follow up with **post-processing SQL / side effects** based on the extraction result, such as "insert a default user into a specific tenant after import" or "write one audit log row".
+
+Writing these by hand as separate files every time is tedious, so hooks can be described as part of the extraction job. A hook should be able to reference CLI options such as `--ids` (e.g. "seed default users for the array of tenant IDs being extracted").
+
+Goals:
+
+- Add an `--after-insert=PATH` option. `PATH` may be a `.rb` or `.sh` file.
+- Extension `.rb`: provides a lightweight DSL (`cli_options`, `insert_sql` / `insert_jsonl`). String arguments are evaluated as ERB, and the results are concatenated and written out as the final insert file.
+- Extension `.sh` (and anything other than `.rb`): run as a child process with the CLI options passed via environment variables. No output file is generated (a pure side-effect hook).
+- The hook runs exactly once, after the per-table insert/delete loop completes.
+
+## Design
+
+### CLI layer
+**File**: `lib/exwiw/cli.rb`
+
+- Initialize `@after_insert_path = nil` (in `initialize`).
+- Add `opts.on("--after-insert=[PATH]", "Path to a .rb or .sh post-processing hook") { |v| @after_insert_path = File.expand_path(v) }` inside `parser`.
+- Validate the following in `validate_options!`:
+  - When the path does not exist: `$stderr.puts "--after-insert file not found: #{@after_insert_path}"; exit 1`
+  - When the extension is neither `.rb` nor `.sh` and the executable bit is not set: exit 1 with `--after-insert must be a .rb file or an executable script`.
+- Add a **method that turns all CLI options into a Hash** for `cli_options` (`build_cli_options_hash`):
+  ```ruby
+  {
+    database_host: @database_host, database_port: @database_port,
+    database_user: @database_user, database_password: @database_password,
+    output_dir: @output_dir, config_dir: @config_dir,
+    database_adapter: @database_adapter, database_name: @database_name,
+    target_table: @target_table_name, ids: @ids.dup.freeze,
+    output_format: @output_format, insert_only: @insert_only,
+    log_level: @log_level, after_insert: @after_insert_path,
+  }.freeze
+  ```
+- Add `after_insert_path: @after_insert_path, cli_options: build_cli_options_hash` to the `Runner.new(...)` call.
+
+### Runner integration
+**File**: `lib/exwiw/runner.rb`
+
+- Add `after_insert_path: nil, cli_options: {}` to the keyword arguments of `initialize` and store them in instance variables.
+- Immediately **after** the per-table loop in `run` (`ordered_table_names.each_with_index`) finishes (right after the current line 98):
+  ```ruby
+  if @after_insert_path
+    @logger.info("Running after-insert hook: #{@after_insert_path}")
+    AfterInsertHook.run(
+      path: @after_insert_path,
+      cli_options: @cli_options,
+      output_dir: @output_dir,
+      next_idx: total_size + 1,
+      output_extension: adapter.output_extension,
+      logger: @logger,
+    )
+  end
+  ```
+- `total_size` is an existing variable (`ordered_table_names.size`). The schema uses `000` and the per-table files use `001..total_size`, so the hook output is number `total_size + 1`. `delete-*` uses reverse-order numbering, so there is no collision.
+
+### New file: `lib/exwiw/after_insert_hook.rb`
+
+```ruby
+require 'erb'
+require 'shellwords'
+
+module Exwiw
+  class AfterInsertHook
+    def self.run(path:, cli_options:, output_dir:, next_idx:, output_extension:, logger:)
+      ext = File.extname(path)
+      idx_str = next_idx.to_s.rjust(3, '0')
+      output_path = File.join(output_dir, "insert-#{idx_str}-after_insert.#{output_extension}")
+
+      case ext
+      when '.rb'
+        run_ruby(path: path, cli_options: cli_options, output_path: output_path, logger: logger)
+      else
+        run_shell(path: path, cli_options: cli_options, output_dir: output_dir, logger: logger)
+      end
+    end
+
+    def self.run_ruby(path:, cli_options:, output_path:, logger:)
+      ctx = Context.new(cli_options)
+      ctx.instance_eval(File.read(path), path)
+      sql = ctx.collected.join("\n")
+      if sql.empty?
+        logger.info("After-insert hook produced no output; skipping file write.")
+        return
+      end
+      File.write(output_path, sql)
+      logger.info("Wrote after-insert hook output to #{output_path}")
+    end
+
+    def self.run_shell(path:, cli_options:, output_dir:, logger:)
+      env = {
+        'EXWIW_OUTPUT_DIR' => output_dir,
+        'EXWIW_CONFIG_DIR' => cli_options[:config_dir].to_s,
+        'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
+        'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
+        'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
+        'EXWIW_DATABASE_USER' => cli_options[:database_user].to_s,
+        'EXWIW_DATABASE_NAME' => cli_options[:database_name].to_s,
+        'EXWIW_TARGET_TABLE' => cli_options[:target_table].to_s,
+        'EXWIW_IDS' => Array(cli_options[:ids]).join(','),
+        'EXWIW_OUTPUT_FORMAT' => cli_options[:output_format].to_s,
+      }
+      # DATABASE_PASSWORD is inherited from the existing ENV as-is (not overridden in the env hash).
+      ok = system(env, path)
+      raise "after-insert shell hook failed: #{path}" unless ok
+    end
+
+    class Context
+      attr_reader :cli_options, :collected
+
+      def initialize(cli_options)
+        @cli_options = cli_options
+        @collected = []
+      end
+
+      # ERB evaluation. An `insert_jsonl` alias is also provided for MongoDB (the output file name and
+      # extension are decided on the Runner side from adapter.output_extension, so either call accumulates into the same buffer).
+      def insert_sql(template)
+        @collected << ERB.new(template, trim_mode: '-').result(binding)
+      end
+      alias_method :insert_jsonl, :insert_sql
+    end
+  end
+end
+```
+
+Key points:
+- With `instance_eval(File.read(path), path)`, `cli_options` and `insert_sql` can be used as plain method calls inside the hook file (DSL style).
+- ERB evaluation uses the `binding` of `Context#insert_sql`, so `cli_options.fetch(:ids)` can also be called inside ERB templates.
+- Calling `insert_sql` / `insert_jsonl` multiple times pushes onto `@collected`, which is finally joined with `"\n"` and written to a single file.
+- Nothing is written when the output is empty (behavior close to idempotency).
+- Shell execution uses `system(env, path)`. `path` is passed as a single argument, so it is not shell-expanded (no need for Shellwords). On failure it stops with an exception → exit.
+- ENV names are namespaced with the `EXWIW_*` prefix. `DATABASE_PASSWORD` is inherited from the parent process ENV (the hook can read it if it wants to).
+
+### Add the require to lib/exwiw.rb
+Add `require_relative "exwiw/after_insert_hook"` next to the `runner.rb` require.
+
+### Usage example (to be added to the README)
+
+`hooks/seed_default_users.rb`:
+```ruby
+# cli_options[:ids] holds the array passed via --ids
+insert_sql <<~SQL
+  -- seed default users for tenants <%= cli_options.fetch(:ids).join(',') %>
+  <%- cli_options.fetch(:ids).each do |tenant_id| -%>
+  INSERT INTO users (tenant_id, email) VALUES (<%= tenant_id %>, 'default@example.com');
+  <%- end -%>
+SQL
+```
+
+Run:
+```
+exwiw --adapter=mysql2 ... --target-table=shops --ids=1,2 \
+  --after-insert=hooks/seed_default_users.rb
+```
+Result: `dump/insert-{total+1}-after_insert.sql` is written, containing the INSERTs for tenants 1 and 2.
+
+## Files to modify / add
+
+| Path | Change |
+|---|---|
+| `lib/exwiw/cli.rb` | Add `--after-insert` parse / validate / `build_cli_options_hash`, propagate to the Runner call |
+| `lib/exwiw/runner.rb` | Add `after_insert_path:`, `cli_options:` to `initialize`, invoke the hook after the per-table loop completes |
+| `lib/exwiw/after_insert_hook.rb` (new) | `AfterInsertHook.run` + `Context` (DSL + ERB) |
+| `lib/exwiw.rb` | Add `require_relative "exwiw/after_insert_hook"` |
+| `README.md` | Add an `--after-insert` section (Ruby DSL example / shell hook example / list of environment variables) |
+| `spec/runner_spec.rb` | Assert that a Ruby hook writes `insert-{N+1}-after_insert.sql` and that `cli_options.fetch(:ids)` is expanded by ERB |
+| `spec/after_insert_hook_spec.rb` (new) | Assert ERB evaluation in `Context#insert_sql` and that multiple calls are joined with `"\n"` |
+
+## Verification
+
+1. **Unit tests**: `bundle exec rspec spec/after_insert_hook_spec.rb`: call `Context#insert_sql` directly and confirm that ERB can resolve `cli_options.fetch(:ids)` and that multiple calls are joined with `\n`.
+2. **Integration tests**: `bundle exec rspec spec/runner_spec.rb`: actually run the Runner via sqlite3, pass a hook `.rb` written to tmp as the equivalent of `--after-insert`, and assert that `tmp/.../insert-{N+1}-after_insert.sql` is generated and that the file contains the `--ids` values after ERB expansion.
+3. **CLI E2E**: temporarily edit the scenario's `test_with_sqlite3.sh` to try an invocation with `--after-insert=`, and visually confirm that the expected files are placed in the output directory. For MongoDB, only smoke-test that a `.jsonl` is produced when `insert_jsonl` is used with `--after-insert=hook.rb`.
+4. **Edge cases**:
+   - `--after-insert=missing.rb` exits with a clear error.
+   - When the hook never calls `insert_sql`, no file is created and only an info log is emitted.
+   - A non-zero exit code from a shell hook brings down the Runner.
+   - When `--ids` is omitted, `cli_options.fetch(:ids)` returns an empty array (the initial value `@ids = []` is preserved).
+
+## Notes / known risks
+
+- **Arbitrary code execution**: `.rb` hooks are executed via `instance_eval` (that is, with the same privileges as the exwiw process). This is acceptable because users are expected to pass hooks they prepared themselves, but the README should carry a "trusted sources only" note.
+- **File number collisions**: `delete-*` uses reverse-order numbering (`total_size - idx`), so it does not directly collide with `insert-{total+1}-after_insert` (it stays in the range `delete-001`...`delete-{total}`).
+- **Limits of MongoDB support**: `insert_jsonl` writes the ERB output as-is to `.jsonl`. Keeping it to one document per line is the user's responsibility. It is assumed to be loadable with `mongoimport`.
+- **Password handling**: `DATABASE_PASSWORD` is not explicitly put into the shell hook's env (it is inherited from the parent process ENV). To prevent leaks via the process list, only the `EXWIW_*` values are passed through ENV via the hash.
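The file numbering discussed in the plan above (schema at `000`, per-table files at `001..total_size`, hook output at `total_size + 1`) can be sketched as follows. Both helper names are hypothetical. `format("%03d", ...)` is used here in place of the plan's `rjust(3, '0')`, which gives the same zero-padded result for non-negative indexes.

```ruby
# Names of the per-table insert files, numbered from 001 in dump order.
def insert_file_names(table_names, extension)
  table_names.each_with_index.map do |name, i|
    format("insert-%03d-%s.%s", i + 1, name, extension)
  end
end

# Name of the hook output file: one past the last per-table insert file.
def hook_output_name(table_count, extension)
  format("insert-%03d-after_insert.%s", table_count + 1, extension)
end
```

Because the hook index is derived from the table count, the hook file sorts after every per-table insert file when the directory is listed in name order.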
data/docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md
ADDED
@@ -0,0 +1,91 @@
+# Add a scenario_test that verifies the SQL validity of PostgreSQL COPY mode
+
+## Context
+
+The PostgreSQL adapter of `exwiw` can switch to `COPY ... FROM stdin;` output with `--output-format=copy` (`lib/exwiw/adapter/postgresql_adapter.rb:77-85`). In the existing tests:
+
+- Unit tests (`spec/adapter/postgresql_adapter_spec.rb:256-305`) verify the string format
+- Runner integration tests (`spec/runner_spec.rb:236-286`) verify the file structure
+
+However, **there is no end-to-end test that verifies whether the generated COPY-mode SQL can actually be loaded with `psql -f`**. The user suspects that COPY mode may be emitting invalid SQL and wants to verify it against a real DB.
+
+For the existing INSERT mode, `scenario/test_with_postgresql.sh` verifies everything up to re-import with `psql -f`. There is no corresponding COPY-mode version.
+
+Goal: add an E2E scenario that actually feeds COPY-mode output to psql, plus a snapshot regression test, to surface any latent invalid SQL.
+
+## Changed files
+
+1. **New** `scenario/test_with_postgresql_copy.sh`: E2E shell script
+2. **Modified** `spec/insert_output_snapshot_spec.rb`: SCENARIOS entry for COPY and `snapshot_subdir` support
+3. **Modified** `.github/workflows/scenario.yml`: new step in the `with_postgres` job
+4. **New** `spec/insert_output_snapshots/postgresql-copy/insert-*.sql`: auto-generated with `UPDATE_SNAPSHOTS=1`
+
+## Details
+
+### 1. `scenario/test_with_postgresql_copy.sh`
+
+Use `scenario/test_with_postgresql.sh` as a template and replace only the following:
+
+- `FROM_DATABASE_NAME="exwiw_scenario_prod_db_copy"`
+- `TO_DATABASE_NAME="exwiw_scenario_dev_db_copy"` (a name that does not collide with the existing scenario even when run in parallel)
+- Add `--output-format=copy` to `exe/exwiw`
+- Change to `--output-dir=tmp/postgresql-copy`
+- The `delete-*.sql` / `insert-*.sql` loops also reference `tmp/postgresql-copy/`
+
+With `set -e`, the script fails immediately if psql exits on a syntax/data error in a COPY block. This is the detection point for the "invalid SQL" the user suspects.
+
+The final verification (whether `INSERT INTO shops ... ` succeeds with auto-increment) is reused as-is, so everything up to the `post_insert_sql` (sequence setval) appended after `to_copy_from_stdin` is verified.
+
+Grant execute permission with `chmod +x` (matching the sibling scripts).
+
+### 2. `spec/insert_output_snapshot_spec.rb`
+
+- Change `snapshot_dir` on line 78 to `scenario[:snapshot_subdir] || scenario[:adapter]` so that multiple scenarios can exist for the same adapter
+- Add a suffix to the context label on line 76 when `output_format` is present, so the scenarios can be told apart in rspec output
+- Add the following to the `SCENARIOS` array:
+
+```ruby
+{
+  adapter: "postgresql",
+  config_dir: "scenario/postgresql-schema",
+  output_format: "copy",
+  snapshot_subdir: "postgresql-copy",
+  connection: { adapter: "postgresql", database_name: "exwiw_test",
+                host: "127.0.0.1", port: 5432,
+                user: "postgres", password: "test_password" },
+},
+```
+
+`scenario.fetch(:output_format, "insert")` on line 86 already exists, so no further change is needed. The pg_dump normalization of `insert-000-schema.sql` (lines 21-25) also works as-is because the adapter is the same.
+
+### 3. `.github/workflows/scenario.yml`
+
+Add after the "Run exwiw (from clean target DB)" step (line 115) of the `with_postgres` job:
+
+```yaml
+- name: Run exwiw (copy mode)
+  run: scenario/test_with_postgresql_copy.sh
+```
+
+The `postgres:17-alpine` service and the `postgresql-client-17` installation are already handled by existing steps, so nothing needs to be added.
+
+### 4. Snapshot generation
+
+```
+UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb
+```
+
+Files equivalent to `insert-000-schema.sql` + `insert-001-shops.sql` ... `insert-007-transactions.sql` are generated under `spec/insert_output_snapshots/postgresql-copy/`. Commit these to git.
+
+## Verification steps
+
+1. Start `docker compose up -d postgres` locally
+2. Run `bash scenario/test_with_postgresql_copy.sh`. Exit 0 means the COPY-mode SQL is valid via psql; non-zero means invalid SQL has surfaced (identify the cause at that point and fix it separately)
+3. Generate the snapshots with `UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb`
+4. Re-run `bundle exec rspec spec/insert_output_snapshot_spec.rb` without `UPDATE_SNAPSHOTS` and confirm that all scenarios (sqlite3 / mysql2 / postgresql / postgresql-copy / mongodb) pass
+5. Confirm on CI that the new step `Run exwiw (copy mode)` in the `with_postgres` job passes (or detects invalid SQL)
+
+## Expected outcomes
+
+- **If the tests pass**: the user's suspicion was unfounded (at least within the scope of the seed data). The test remains as a regression test and can catch SQL breakage early in future COPY-mode changes
+- **If the tests fail**: identify the cause from how it fails (the psql error message). The fix is out of scope for this plan and is handled as a separate task (report to the user → decide on an approach → implement)
data/lib/exwiw/adapter/mongodb_adapter.rb
CHANGED
@@ -97,6 +97,10 @@ module Exwiw
         raise NotImplementedError, "MongodbAdapter does not support bulk delete"
       end
 
+      def explain(_query)
+        raise NotImplementedError, "MongodbAdapter does not support explain yet"
+      end
+
       def output_extension
         'jsonl'
       end
data/lib/exwiw/adapter/mysql2_adapter.rb
CHANGED
@@ -14,6 +14,17 @@ module Exwiw
         connection.query(sql, cast: false, as: :array).to_a
       end
 
+      def explain(query_ast)
+        sql = compile_ast(query_ast)
+
+        @logger.debug("  Executing EXPLAIN: \n#{sql}")
+        rows = connection.query("EXPLAIN #{sql}", cast: false).to_a
+        rows.each_with_index.flat_map do |row, i|
+          ["*************************** #{i + 1}. row ***************************"] +
+            row.map { |k, v| "#{k}: #{v}" }
+        end.join("\n")
+      end
+
       def dump_schema(ordered_tables, output_path)
         require 'open3'
 
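The row formatting inside the mysql2 `explain` method above resembles the vertical (`\G`) output of the `mysql` client. The standalone sketch below applies the same `flat_map` to hand-written sample rows, so it runs without a database. The rows are illustrative and do not come from a real `EXPLAIN`, and `format_vertical` is a hypothetical name.

```ruby
# Format an array of row hashes as one banner line per row followed by
# "column: value" lines, joined with newlines.
def format_vertical(rows)
  rows.each_with_index.flat_map do |row, i|
    ["*************************** #{i + 1}. row ***************************"] +
      row.map { |k, v| "#{k}: #{v}" }
  end.join("\n")
end

# Illustrative rows only; a real EXPLAIN result has more columns.
sample_rows = [
  { "id" => "1", "select_type" => "SIMPLE", "table" => "shops" },
  { "id" => "1", "select_type" => "SIMPLE", "table" => "users" },
]
puts format_vertical(sample_rows)
```

Each row contributes one banner line plus one line per column, which keeps wide `EXPLAIN` rows readable on a terminal.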
data/lib/exwiw/adapter/postgresql_adapter.rb
CHANGED
@@ -14,6 +14,13 @@ module Exwiw
         connection.exec(sql).values
       end
 
+      def explain(query_ast)
+        sql = compile_ast(query_ast)
+
+        @logger.debug("  Executing EXPLAIN: \n#{sql}")
+        connection.exec("EXPLAIN #{sql}").values.map(&:first).join("\n")
+      end
+
       def dump_schema(ordered_tables, output_path)
         require 'open3'
 
data/lib/exwiw/adapter/sqlite3_adapter.rb
CHANGED
@@ -14,6 +14,14 @@ module Exwiw
         connection.execute(sql)
       end
 
+      def explain(query_ast)
+        sql = compile_ast(query_ast)
+
+        @logger.debug("  Executing EXPLAIN QUERY PLAN: \n#{sql}")
+        rows = connection.execute("EXPLAIN QUERY PLAN #{sql}")
+        rows.map { |row| row[3] }.join("\n")
+      end
+
       def dump_schema(ordered_tables, output_path)
         @logger.debug("  Reading schema from sqlite_master...")
         target_names = ordered_tables.map(&:name)
data/lib/exwiw/adapter.rb
CHANGED
@@ -74,6 +74,13 @@ module Exwiw
       def to_copy_from_stdin(_results, _table)
         raise NotImplementedError, "COPY format is not supported by #{self.class.name}"
       end
+
+      # Run the database-specific EXPLAIN for the given query and return the
+      # output as a single string for `explain` subcommand to print.
+      # SQL adapters override; MongodbAdapter currently raises.
+      def explain(_query_ast)
+        raise NotImplementedError, "#{self.class.name} does not implement #explain"
+      end
     end
 
     # @params [Exwiw::QueryAst] query_ast
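The base `explain` above raises `NotImplementedError`, and the SQL adapters override it. A caller that wants to report an unsupported adapter gracefully could rescue that error, as in the sketch below. All class and method names here are hypothetical, and whether `ExplainRunner` does anything similar is not visible in this diff. One Ruby detail is worth keeping in mind: `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` would not catch it.

```ruby
# Hypothetical base class mirroring the default shown in the hunk above.
class BaseAdapterSketch
  def explain(_query_ast)
    raise NotImplementedError, "#{self.class.name} does not implement #explain"
  end
end

# A SQL-style adapter overrides the default.
class SqlAdapterSketch < BaseAdapterSketch
  def explain(query_ast)
    "EXPLAIN #{query_ast}"
  end
end

# A document-store adapter keeps the raising default.
class DocumentAdapterSketch < BaseAdapterSketch; end

# NotImplementedError must be named explicitly: it is a ScriptError,
# so `rescue => e` (StandardError only) would let it propagate.
def try_explain(adapter, query)
  adapter.explain(query)
rescue NotImplementedError => e
  "unsupported: #{e.message}"
end
```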
data/lib/exwiw/after_insert_hook.rb
ADDED
@@ -0,0 +1,63 @@
+# frozen_string_literal: true
+
+require 'erb'
+
+module Exwiw
+  class AfterInsertHook
+    def self.run(path:, cli_options:, output_dir:, next_idx:, output_extension:, logger:)
+      ext = File.extname(path)
+      idx_str = next_idx.to_s.rjust(3, '0')
+      output_path = File.join(output_dir, "insert-#{idx_str}-after_insert.#{output_extension}")
+
+      if ext == '.rb'
+        run_ruby(path: path, cli_options: cli_options, output_path: output_path, logger: logger)
+      else
+        run_shell(path: path, cli_options: cli_options, output_dir: output_dir, logger: logger)
+      end
+    end
+
+    def self.run_ruby(path:, cli_options:, output_path:, logger:)
+      ctx = Context.new(cli_options)
+      ctx.instance_eval(File.read(path), path)
+      content = ctx.collected.join("\n")
+      if content.empty?
+        logger.info("After-insert hook produced no output; skipping file write.")
+        return
+      end
+      File.write(output_path, content)
+      logger.info("Wrote after-insert hook output to #{output_path}")
+    end
+
+    def self.run_shell(path:, cli_options:, output_dir:, logger:)
+      env = {
+        'EXWIW_OUTPUT_DIR' => output_dir,
+        'EXWIW_CONFIG_DIR' => cli_options[:config_dir].to_s,
+        'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
+        'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
+        'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
+        'EXWIW_DATABASE_USER' => cli_options[:database_user].to_s,
+        'EXWIW_DATABASE_NAME' => cli_options[:database_name].to_s,
+        'EXWIW_TARGET_TABLE' => cli_options[:target_table].to_s,
+        'EXWIW_IDS' => Array(cli_options[:ids]).join(','),
+        'EXWIW_OUTPUT_FORMAT' => cli_options[:output_format].to_s,
+      }
+      logger.info("Running after-insert shell hook: #{path}")
+      ok = system(env, path)
+      raise "after-insert shell hook failed: #{path}" unless ok
+    end
+
+    class Context
+      attr_reader :cli_options, :collected
+
+      def initialize(cli_options)
+        @cli_options = cli_options
+        @collected = []
+      end
+
+      def insert_sql(template)
+        @collected << ERB.new(template, trim_mode: '-').result(binding)
+      end
+      alias_method :insert_jsonl, :insert_sql
+    end
+  end
+end
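To see how the `Context` DSL in the file above behaves, the sketch below evaluates a hook from an inline string instead of a file. `HookContextSketch` is a hypothetical stand-in that copies only the `insert_sql` logic shown in the diff. The hook text and the `ids` values are made up for illustration.

```ruby
require 'erb'

# Hypothetical stand-in for the Context class in the diff above.
class HookContextSketch
  attr_reader :cli_options, :collected

  def initialize(cli_options)
    @cli_options = cli_options
    @collected = []
  end

  # Render the template with ERB; trim_mode '-' enables the <%- ... -%> tags.
  def insert_sql(template)
    @collected << ERB.new(template, trim_mode: '-').result(binding)
  end
end

# Hook source as it might appear in a .rb hook file (illustrative content).
hook_source = <<~'RUBY'
  insert_sql "-- tenants <%= cli_options.fetch(:ids).join(',') %>"
  insert_sql <<~SQL
    <%- cli_options.fetch(:ids).each do |id| -%>
    INSERT INTO users (tenant_id) VALUES (<%= id %>);
    <%- end -%>
  SQL
RUBY

ctx = HookContextSketch.new({ ids: [1, 2] }.freeze)
# instance_eval makes insert_sql and cli_options callable as bare methods.
ctx.instance_eval(hook_source, "hook_sketch.rb")
output = ctx.collected.join("\n")
puts output
```

Each `insert_sql` call adds one entry to `collected`, and the final `join("\n")` mirrors how the diff concatenates them before writing the file.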
data/lib/exwiw/cli.rb
CHANGED
@@ -10,25 +10,37 @@ require 'exwiw'
 
 module Exwiw
   class CLI
+    KNOWN_SUBCOMMANDS = %w[dump explain].freeze
+
     def self.start(argv)
       new(argv).run
     end
 
     def initialize(argv)
       @argv = argv.dup
-
+
+      @subcommand =
+        if !@argv.empty? && !@argv.first.start_with?("-") && KNOWN_SUBCOMMANDS.include?(@argv.first)
+          @argv.shift
+        else
+          "dump"
+        end
+
+      @help = @argv.empty?
 
       @database_host = nil
       @database_port = nil
       @database_user = nil
       @database_password = ENV["DATABASE_PASSWORD"]
-      @output_dir =
+      @output_dir = nil
       @config_dir = nil
       @database_adapter = nil
       @database_name = nil
       @target_table_name = nil
       @ids = []
-      @output_format =
+      @output_format = nil
+      @insert_only = nil
+      @after_insert_hook_path = nil
       @log_level = :info
 
       parser.parse!(@argv)
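The subcommand detection added to `initialize` in the hunk above can be tried in isolation with a small function. This sketch restates the same condition. `split_subcommand` and the constant name are hypothetical, and unlike the original it returns the remaining arguments rather than mutating an instance variable.

```ruby
KNOWN_SUBCOMMANDS_SKETCH = %w[dump explain].freeze

# Returns [subcommand, remaining_args]. Falls back to "dump" when the first
# argument is missing, looks like a flag, or is not a known subcommand.
def split_subcommand(argv)
  rest = argv.dup
  first = rest.first
  if first && !first.start_with?("-") && KNOWN_SUBCOMMANDS_SKETCH.include?(first)
    [rest.shift, rest]
  else
    ["dump", rest]
  end
end
```

Defaulting to `"dump"` is what keeps existing invocations without a subcommand working, as the README and changelog describe.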
@@ -37,37 +49,56 @@ module Exwiw
     def run
       if @help
         puts parser.help
-
-
+        return
+      end
 
-
-        adapter: @database_adapter,
-        host: @database_host,
-        port: @database_port,
-        user: @database_user,
-        password: @database_password,
-        database_name: @database_name,
-      )
+      validate_options!
 
-
-
-
-
+      connection_config = ConnectionConfig.new(
+        adapter: @database_adapter,
+        host: @database_host,
+        port: @database_port,
+        user: @database_user,
+        password: @database_password,
+        database_name: @database_name,
+      )
 
-
+      dump_target = DumpTarget.new(
+        table_name: @target_table_name,
+        ids: @ids,
+      )
+
+      logger = build_logger
 
+      case @subcommand
+      when "dump"
         Runner.new(
           connection_config: connection_config,
           output_dir: @output_dir,
           config_dir: @config_dir,
           dump_target: dump_target,
           output_format: @output_format,
+          insert_only: @insert_only,
+          after_insert_hook_path: @after_insert_hook_path,
+          cli_options: build_cli_options_hash,
+          logger: logger,
+        ).run
+      when "explain"
+        ExplainRunner.new(
+          connection_config: connection_config,
+          config_dir: @config_dir,
+          dump_target: dump_target,
           logger: logger,
+          io: $stdout,
         ).run
       end
     end
 
     private def validate_options!
+      if @subcommand == "explain"
+        validate_explain_only!
+      end
+
       if @database_adapter != "sqlite3"
         required_options = {
           "Target database host" => @database_host,
@@ -94,15 +125,21 @@ module Exwiw
|
|
|
94
125
|
exit 1
|
|
95
126
|
end
|
|
96
127
|
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
|
|
100
|
-
|
|
101
|
-
end
|
|
128
|
+
if @subcommand == "dump"
|
|
129
|
+
@output_dir ||= "dump"
|
|
130
|
+
@output_format ||= "insert"
|
|
131
|
+
@insert_only = @insert_only ? true : false
|
|
102
132
|
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
133
|
+
valid_output_formats = ["insert", "copy"]
|
|
134
|
+
unless valid_output_formats.include?(@output_format)
|
|
135
|
+
$stderr.puts "Invalid output format '#{@output_format}'. Available options are: #{valid_output_formats.join(', ')}"
|
|
136
|
+
exit 1
|
|
137
|
+
end
|
|
138
|
+
|
|
139
|
+
if @output_format == "copy" && @database_adapter != "postgresql"
|
|
140
|
+
$stderr.puts "--output-format=copy is only supported with the postgresql adapter"
|
|
141
|
+
exit 1
|
|
142
|
+
end
|
|
106
143
|
end
|
|
107
144
|
|
|
108
145
|
if @config_dir.nil?
|
|
@@ -129,6 +166,56 @@ module Exwiw
|
|
|
129
166
|
$stderr.puts "--target-table is required when --ids is specified"
|
|
130
167
|
exit 1
|
|
131
168
|
end
|
|
169
|
+
|
|
170
|
+
if @after_insert_hook_path
|
|
171
|
+
unless File.file?(@after_insert_hook_path)
|
|
172
|
+
$stderr.puts "--after-insert-hook file not found: #{@after_insert_hook_path}"
|
|
173
|
+
exit 1
|
|
174
|
+
end
|
|
175
|
+
|
|
176
|
+
ext = File.extname(@after_insert_hook_path)
|
|
177
|
+
if ext != '.rb' && !File.executable?(@after_insert_hook_path)
|
|
178
|
+
$stderr.puts "--after-insert-hook must be a .rb file or an executable script: #{@after_insert_hook_path}"
|
|
179
|
+
exit 1
|
|
180
|
+
end
|
|
181
|
+
end
|
|
182
|
+
end
|
|
183
|
+
|
|
184
|
+
private def validate_explain_only!
|
|
185
|
+
if @database_adapter == "mongodb"
|
|
186
|
+
$stderr.puts "mongodb adapter is not yet supported by 'explain' subcommand"
|
|
187
|
+
exit 1
|
|
188
|
+
end
|
|
189
|
+
|
|
190
|
+
rejected = []
|
|
191
|
+
rejected << "--output-dir" unless @output_dir.nil?
|
|
192
|
+
rejected << "--output-format" unless @output_format.nil?
|
|
193
|
+
rejected << "--insert-only" unless @insert_only.nil?
|
|
194
|
+
rejected << "--after-insert-hook" unless @after_insert_hook_path.nil?
|
|
195
|
+
|
|
196
|
+
unless rejected.empty?
|
|
197
|
+
$stderr.puts "The following options are not applicable in 'explain' subcommand: #{rejected.join(', ')}"
|
|
198
|
+
exit 1
|
|
199
|
+
end
|
|
200
|
+
end
|
|
201
|
+
|
|
202
|
+
private def build_cli_options_hash
|
|
203
|
+
{
|
|
204
|
+
database_host: @database_host,
|
|
205
|
+
database_port: @database_port,
|
|
206
|
+
database_user: @database_user,
|
|
207
|
+
database_password: @database_password,
|
|
208
|
+
output_dir: @output_dir,
|
|
209
|
+
config_dir: @config_dir,
|
|
210
|
+
database_adapter: @database_adapter,
|
|
211
|
+
database_name: @database_name,
|
|
212
|
+
target_table: @target_table_name,
|
|
213
|
+
ids: @ids.dup.freeze,
|
|
214
|
+
output_format: @output_format,
|
|
215
|
+
insert_only: @insert_only,
|
|
216
|
+
log_level: @log_level,
|
|
217
|
+
after_insert_hook: @after_insert_hook_path,
|
|
218
|
+
}.freeze
|
|
132
219
|
end
|
|
133
220
|
|
|
134
221
|
private def build_logger
|
|
@@ -148,13 +235,22 @@ module Exwiw
|
|
|
148
235
|
|
|
149
236
|
private def parser
|
|
150
237
|
@parser ||= OptionParser.new do |opts|
|
|
151
|
-
opts.banner =
|
|
238
|
+
opts.banner = <<~BANNER
|
|
239
|
+
exwiw #{Exwiw::VERSION}
|
|
240
|
+
|
|
241
|
+
Usage: exwiw [SUBCOMMAND] [options]
|
|
242
|
+
|
|
243
|
+
Subcommands:
|
|
244
|
+
dump Generate INSERT/COPY SQL files (default when omitted).
|
|
245
|
+
explain Print EXPLAIN output for each extraction query to stdout.
|
|
246
|
+
(not yet supported for the mongodb adapter)
|
|
247
|
+
BANNER
|
|
152
248
|
opts.version = Exwiw::VERSION
|
|
153
249
|
|
|
154
250
|
opts.on("-h", "--host=HOST", "Target database host") { |v| @database_host = v }
|
|
155
251
|
opts.on("-p", "--port=PORT", "Target database port") { |v| @database_port = v }
|
|
156
252
|
opts.on("-u", "--user=USERNAME", "Target database user") { |v| @database_user = v }
|
|
157
|
-
opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/") do |v|
|
|
253
|
+
opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/ (dump subcommand only)") do |v|
|
|
158
254
|
v = v.end_with?("/") ? v[0..-2] : v
|
|
159
255
|
@output_dir = File.expand_path(v)
|
|
160
256
|
end
|
|
@@ -166,7 +262,11 @@ module Exwiw
|
|
|
166
262
|
opts.on("--database=DATABASE", "Target database name") { |v| @database_name = v }
|
|
167
263
|
opts.on("--target-table=[TABLE]", "Target table for extraction. If omitted, dump all tables.") { |v| @target_table_name = v }
|
|
168
264
|
opts.on("--ids=[IDS]", "Comma-separated list of identifiers. Required when --target-table is given.") { |v| @ids = v.split(',') }
|
|
169
|
-
opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only)") { |v| @output_format = v }
|
|
265
|
+
opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only, dump subcommand only)") { |v| @output_format = v }
|
|
266
|
+
opts.on("--insert-only", "Do not generate DELETE SQL files (dump subcommand only)") { @insert_only = true }
|
|
267
|
+
opts.on("--after-insert-hook=PATH", "Path to a .rb or .sh post-processing hook executed after all insert/delete files are written (dump subcommand only)") do |v|
|
|
268
|
+
@after_insert_hook_path = File.expand_path(v)
|
|
269
|
+
end
|
|
170
270
|
opts.on("--log-level=LEVEL", "Log level (debug, info). default is info") { |v| @log_level = v.to_sym }
|
|
171
271
|
|
|
172
272
|
opts.on("--help", "Print this help") do
|
|
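Two ideas in the `cli.rb` changes above may be easier to follow in isolation: the first argument is consumed as a subcommand only when it is a known one, and dump-only options now start as `nil` so that `explain` can tell "not given" from "given". The sketch below is a simplified stand-in, not the gem's code; the helper names `split_subcommand` and `rejected_for_explain` are hypothetical.

```ruby
KNOWN_SUBCOMMANDS = %w[dump explain].freeze

# Returns [subcommand, remaining_args]. Falls back to "dump" when the first
# token is not a known subcommand, matching the default stated in the banner.
def split_subcommand(argv)
  args = argv.dup
  subcommand = KNOWN_SUBCOMMANDS.include?(args.first) ? args.shift : "dump"
  [subcommand, args]
end

# Dump-only options default to nil, so any non-nil value means the user
# passed the flag explicitly and it can be reported as not applicable.
def rejected_for_explain(options)
  {
    "--output-dir" => options[:output_dir],
    "--output-format" => options[:output_format],
    "--insert-only" => options[:insert_only],
    "--after-insert-hook" => options[:after_insert_hook],
  }.reject { |_flag, value| value.nil? }.keys
end

p split_subcommand(%w[explain --database=app])  # => ["explain", ["--database=app"]]
p split_subcommand(%w[--database=app])          # => ["dump", ["--database=app"]]
p rejected_for_explain({ output_format: "copy", insert_only: true })
# => ["--output-format", "--insert-only"]
```

This is also why the diff moves the `"dump"` and `"insert"` defaults out of `initialize` and into the `dump` branch of `validate_options!`: applying them earlier would make every `explain` run look as if those flags had been passed.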
data/lib/exwiw/explain_runner.rb
ADDED
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Exwiw
|
|
4
|
+
class ExplainRunner
|
|
5
|
+
def initialize(
|
|
6
|
+
connection_config:,
|
|
7
|
+
config_dir:,
|
|
8
|
+
dump_target:,
|
|
9
|
+
logger:,
|
|
10
|
+
io: $stdout
|
|
11
|
+
)
|
|
12
|
+
@connection_config = connection_config
|
|
13
|
+
@config_dir = config_dir
|
|
14
|
+
@dump_target = dump_target
|
|
15
|
+
@logger = logger
|
|
16
|
+
@io = io
|
|
17
|
+
end
|
|
18
|
+
|
|
19
|
+
def run
|
|
20
|
+
adapter = Adapter.build(@connection_config, @logger)
|
|
21
|
+
configs = load_table_config(adapter.class.table_config_class)
|
|
22
|
+
configs = reject_and_validate_skipped(configs)
|
|
23
|
+
|
|
24
|
+
table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
|
|
25
|
+
|
|
26
|
+
target = table_by_name[@dump_target.table_name]
|
|
27
|
+
adapter.validate_as_dump_target!(target) if target
|
|
28
|
+
|
|
29
|
+
@logger.debug("Determining table processing order...")
|
|
30
|
+
ordered_table_names = DetermineTableProcessingOrder.run(configs.select { |c| adapter.dumpable?(c) })
|
|
31
|
+
|
|
32
|
+
total_size = ordered_table_names.size
|
|
33
|
+
ordered_table_names.each_with_index do |table_name, idx|
|
|
34
|
+
@logger.debug("Explaining '#{table_name}'... (#{idx + 1}/#{total_size})")
|
|
35
|
+
table = table_by_name.fetch(table_name)
|
|
36
|
+
|
|
37
|
+
query_ast = adapter.build_query(table, @dump_target, table_by_name)
|
|
38
|
+
sql = adapter.compile_ast(query_ast)
|
|
39
|
+
explain_text = adapter.explain(query_ast)
|
|
40
|
+
|
|
41
|
+
@io.puts "-- [#{idx + 1}/#{total_size}] #{table_name}"
|
|
42
|
+
@io.puts sql
|
|
43
|
+
@io.puts
|
|
44
|
+
@io.puts "-- EXPLAIN:"
|
|
45
|
+
@io.puts explain_text
|
|
46
|
+
@io.puts
|
|
47
|
+
end
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
private def load_table_config(klass)
|
|
51
|
+
Dir[File.join(@config_dir, "*.json")].map do |file|
|
|
52
|
+
json = JSON.parse(File.read(file))
|
|
53
|
+
klass.from(json)
|
|
54
|
+
end
|
|
55
|
+
end
|
|
56
|
+
|
|
57
|
+
private def reject_and_validate_skipped(configs)
|
|
58
|
+
skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
|
|
59
|
+
return configs if skipped_names.empty?
|
|
60
|
+
|
|
61
|
+
configs.each do |config|
|
|
62
|
+
next if config.skip
|
|
63
|
+
next unless config.respond_to?(:belongs_tos)
|
|
64
|
+
|
|
65
|
+
dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
|
|
66
|
+
next if dangling.empty?
|
|
67
|
+
|
|
68
|
+
raise ArgumentError,
|
|
69
|
+
"Table '#{config.name}' has belongs_to references to skipped table(s): " \
|
|
70
|
+
"#{dangling.map(&:table_name).join(', ')}. " \
|
|
71
|
+
"Remove the belongs_to entries or unset `skip` on the referenced table."
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
|
|
75
|
+
raise ArgumentError,
|
|
76
|
+
"--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
|
|
77
|
+
end
|
|
78
|
+
|
|
79
|
+
skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
|
|
80
|
+
configs.reject { |c| c.skip }
|
|
81
|
+
end
|
|
82
|
+
end
|
|
83
|
+
end
|
|
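`ExplainRunner` takes an `io:` argument (defaulting to `$stdout`) and writes one block per table. A minimal sketch of that output shape follows, using a `StringIO` in place of `$stdout`. The helper `print_explain_block`, the sample SQL, and the sample plan text are hypothetical placeholders; the real SQL and plan come from the adapter's `compile_ast` and `explain` methods.

```ruby
require "stringio"

# Hypothetical helper reproducing the six `@io.puts` calls in the loop body.
def print_explain_block(io, idx:, total:, table_name:, sql:, explain_text:)
  io.puts "-- [#{idx + 1}/#{total}] #{table_name}"
  io.puts sql
  io.puts
  io.puts "-- EXPLAIN:"
  io.puts explain_text
  io.puts
end

io = StringIO.new
print_explain_block(
  io,
  idx: 0,
  total: 2,
  table_name: "shops",
  sql: "SELECT shops.* FROM shops WHERE shops.id IN (1)", # placeholder query
  explain_text: "(plan text returned by the database)"     # placeholder plan
)

puts io.string
```

Because the stream is injected, output can be captured in a test or redirected without touching global `$stdout`.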
data/lib/exwiw/mongodb_collection_config.rb
CHANGED
|
@@ -13,6 +13,7 @@ module Exwiw
|
|
|
13
13
|
attribute :belongs_tos, array(BelongsTo)
|
|
14
14
|
attribute :fields, array(MongodbField)
|
|
15
15
|
attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
|
|
16
|
+
attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
|
|
16
17
|
|
|
17
18
|
# Marks this config as physically embedded inside another collection's
|
|
18
19
|
# documents. When set, this config is not processed as a standalone dump
|
data/lib/exwiw/runner.rb
CHANGED
|
@@ -10,13 +10,19 @@ module Exwiw
|
|
|
10
10
|
config_dir:,
|
|
11
11
|
dump_target:,
|
|
12
12
|
logger:,
|
|
13
|
-
output_format: 'insert'
|
|
13
|
+
output_format: 'insert',
|
|
14
|
+
insert_only: false,
|
|
15
|
+
after_insert_hook_path: nil,
|
|
16
|
+
cli_options: {}
|
|
14
17
|
)
|
|
15
18
|
@connection_config = connection_config
|
|
16
19
|
@output_dir = output_dir
|
|
17
20
|
@config_dir = config_dir
|
|
18
21
|
@dump_target = dump_target
|
|
19
22
|
@output_format = output_format
|
|
23
|
+
@insert_only = insert_only
|
|
24
|
+
@after_insert_hook_path = after_insert_hook_path
|
|
25
|
+
@cli_options = cli_options
|
|
20
26
|
@logger = logger
|
|
21
27
|
end
|
|
22
28
|
|
|
@@ -24,6 +30,8 @@ module Exwiw
|
|
|
24
30
|
adapter = Adapter.build(@connection_config, @logger)
|
|
25
31
|
configs = load_table_config(adapter.class.table_config_class)
|
|
26
32
|
|
|
33
|
+
configs = reject_and_validate_skipped(configs)
|
|
34
|
+
|
|
27
35
|
table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
|
|
28
36
|
|
|
29
37
|
target = table_by_name[@dump_target.table_name]
|
|
@@ -80,7 +88,7 @@ module Exwiw
|
|
|
80
88
|
end
|
|
81
89
|
end
|
|
82
90
|
|
|
83
|
-
if adapter.supports_bulk_delete?
|
|
91
|
+
if adapter.supports_bulk_delete? && !@insert_only
|
|
84
92
|
@logger.debug(" Generate DELETE statement...")
|
|
85
93
|
delete_sql = adapter.to_bulk_delete(query_ast, table)
|
|
86
94
|
if @logger.debug?
|
|
@@ -94,6 +102,18 @@ module Exwiw
|
|
|
94
102
|
end
|
|
95
103
|
end
|
|
96
104
|
end
|
|
105
|
+
|
|
106
|
+
if @after_insert_hook_path
|
|
107
|
+
@logger.info("Running after-insert hook: #{@after_insert_hook_path}")
|
|
108
|
+
AfterInsertHook.run(
|
|
109
|
+
path: @after_insert_hook_path,
|
|
110
|
+
cli_options: @cli_options,
|
|
111
|
+
output_dir: @output_dir,
|
|
112
|
+
next_idx: total_size + 1,
|
|
113
|
+
output_extension: adapter.output_extension,
|
|
114
|
+
logger: @logger,
|
|
115
|
+
)
|
|
116
|
+
end
|
|
97
117
|
end
|
|
98
118
|
|
|
99
119
|
private def load_table_config(klass)
|
|
@@ -102,5 +122,31 @@ module Exwiw
|
|
|
102
122
|
klass.from(json)
|
|
103
123
|
end
|
|
104
124
|
end
|
|
125
|
+
|
|
126
|
+
private def reject_and_validate_skipped(configs)
|
|
127
|
+
skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
|
|
128
|
+
return configs if skipped_names.empty?
|
|
129
|
+
|
|
130
|
+
configs.each do |config|
|
|
131
|
+
next if config.skip
|
|
132
|
+
next unless config.respond_to?(:belongs_tos)
|
|
133
|
+
|
|
134
|
+
dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
|
|
135
|
+
next if dangling.empty?
|
|
136
|
+
|
|
137
|
+
raise ArgumentError,
|
|
138
|
+
"Table '#{config.name}' has belongs_to references to skipped table(s): " \
|
|
139
|
+
"#{dangling.map(&:table_name).join(', ')}. " \
|
|
140
|
+
"Remove the belongs_to entries or unset `skip` on the referenced table."
|
|
141
|
+
end
|
|
142
|
+
|
|
143
|
+
if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
|
|
144
|
+
raise ArgumentError,
|
|
145
|
+
"--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
|
|
146
|
+
end
|
|
147
|
+
|
|
148
|
+
skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
|
|
149
|
+
configs.reject { |c| c.skip }
|
|
150
|
+
end
|
|
105
151
|
end
|
|
106
152
|
end
|
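The `reject_and_validate_skipped` method added above (and repeated in `ExplainRunner`) can be exercised outside the gem with plain structs. The sketch below keeps the same three steps: collect skipped names, reject dangling `belongs_to` references, reject a skipped target table. The `Config` and `Rel` structs, the function name `reject_skipped`, and the table names are hypothetical; logging and the `respond_to?(:belongs_tos)` guard are left out.

```ruby
require "set"

# Hypothetical stand-ins for the gem's table config and belongs_to objects.
Rel = Struct.new(:table_name)
Config = Struct.new(:name, :skip, :belongs_tos)

def reject_skipped(configs, target_table: nil)
  skipped = configs.select(&:skip).map(&:name).to_set
  return configs if skipped.empty?

  # A kept table must not point at a skipped table.
  configs.each do |config|
    next if config.skip

    dangling = config.belongs_tos.select { |rel| skipped.include?(rel.table_name) }
    next if dangling.empty?

    raise ArgumentError,
          "Table '#{config.name}' has belongs_to references to skipped table(s): " \
          "#{dangling.map(&:table_name).join(', ')}"
  end

  # A skipped table cannot be the dump target.
  if target_table && skipped.include?(target_table)
    raise ArgumentError, "--target-table '#{target_table}' is marked skip:true"
  end

  configs.reject(&:skip)
end

shops = Config.new("shops", nil, [])
audit_logs = Config.new("audit_logs", true, [])
orders = Config.new("orders", nil, [Rel.new("shops")])

kept = reject_skipped([shops, audit_logs, orders])
p kept.map(&:name) # => ["shops", "orders"]
```

Validation runs before the skipped configs are dropped, so a misconfiguration surfaces as an `ArgumentError` at load time rather than as a missing parent table later in the run.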
data/lib/exwiw/table_config.rb
CHANGED
|
@@ -10,6 +10,7 @@ module Exwiw
|
|
|
10
10
|
attribute :belongs_tos, array(BelongsTo)
|
|
11
11
|
attribute :columns, array(TableColumn)
|
|
12
12
|
attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
|
|
13
|
+
attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
|
|
13
14
|
|
|
14
15
|
def self.from_symbol_keys(hash)
|
|
15
16
|
from(JSON.parse(hash.to_json))
|
|
@@ -76,6 +77,7 @@ module Exwiw
|
|
|
76
77
|
merged_table.filter = filter
|
|
77
78
|
merged_table.belongs_tos = passed_table.belongs_tos
|
|
78
79
|
merged_table.bulk_insert_chunk_size = passed_table.bulk_insert_chunk_size
|
|
80
|
+
merged_table.skip = skip
|
|
79
81
|
|
|
80
82
|
receiver_column_by_name = columns.each_with_object({}) { |column, hash| hash[column.name] = column }
|
|
81
83
|
|
data/lib/exwiw/version.rb
CHANGED
data/lib/exwiw.rb
CHANGED
|
@@ -21,7 +21,9 @@ require_relative "exwiw/determine_table_processing_order"
|
|
|
21
21
|
require_relative "exwiw/mongo_query"
|
|
22
22
|
require_relative "exwiw/query_ast"
|
|
23
23
|
require_relative "exwiw/query_ast_builder"
|
|
24
|
+
require_relative "exwiw/after_insert_hook"
|
|
24
25
|
require_relative "exwiw/runner"
|
|
26
|
+
require_relative "exwiw/explain_runner"
|
|
25
27
|
require_relative "exwiw/schema_generator"
|
|
26
28
|
|
|
27
29
|
begin
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: exwiw
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.1
|
|
4
|
+
version: 0.2.1
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Shia
|
|
@@ -37,6 +37,8 @@ files:
|
|
|
37
37
|
- README.md
|
|
38
38
|
- docs/plans/2026-05-15-insert-000-schema-file.md
|
|
39
39
|
- docs/plans/2026-05-16-mongodb-from-clean-scenario.md
|
|
40
|
+
- docs/plans/2026-05-22-after-insert-hook.md
|
|
41
|
+
- docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md
|
|
40
42
|
- exe/exwiw
|
|
41
43
|
- lib/exwiw.rb
|
|
42
44
|
- lib/exwiw/adapter.rb
|
|
@@ -44,11 +46,13 @@ files:
|
|
|
44
46
|
- lib/exwiw/adapter/mysql2_adapter.rb
|
|
45
47
|
- lib/exwiw/adapter/postgresql_adapter.rb
|
|
46
48
|
- lib/exwiw/adapter/sqlite3_adapter.rb
|
|
49
|
+
- lib/exwiw/after_insert_hook.rb
|
|
47
50
|
- lib/exwiw/belongs_to.rb
|
|
48
51
|
- lib/exwiw/cli.rb
|
|
49
52
|
- lib/exwiw/ddl_postprocessor.rb
|
|
50
53
|
- lib/exwiw/determine_table_processing_order.rb
|
|
51
54
|
- lib/exwiw/embedded_in.rb
|
|
55
|
+
- lib/exwiw/explain_runner.rb
|
|
52
56
|
- lib/exwiw/mongo_query.rb
|
|
53
57
|
- lib/exwiw/mongodb_collection_config.rb
|
|
54
58
|
- lib/exwiw/mongodb_field.rb
|