exwiw 0.1.9 → 0.2.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: b48628d0a7599b151f957d6a40cd5a23cc3befa5007a86c550c834878b9893cc
-   data.tar.gz: 38fe41ada0a3e0a8f358bd60c3a28a7b7637e91bd2066d2e2a97e5542716540a
+   metadata.gz: 4855b3fc49afc6cc69606579eb15c0cf430c4123024ec8e9b26a5215989292d3
+   data.tar.gz: 1ee99545e00cb43292c59b918dc24d3b4f764427f8708fca773c0b507f6c1e2d
  SHA512:
-   metadata.gz: 4105dbad3eb0291b841e7ebf152776913e19d817dbf3a9901a6838d7f16113c4fbd30887b88015fed735c0ec4b82c03367fafed81f2f6c7669c8f091da9b4041
-   data.tar.gz: 28a6383dbc953b93772f46adaf488cb4f0e826a0642859700c48536de46bde633a55780d8baff588da3e81dd6c4534d2fedd355424f7b68a70082c462eaf59aa
+   metadata.gz: fcbe518ae634e294bd48e61088dd4bd0f1012b317b08d921bcbd7ee95e4e595c69c9df59ad18593dbcee35b017c1fa6a198102f27757183b804218920ff64cbb
+   data.tar.gz: 48475293bc58a6f32ec9edc68c3bdc0bdd1427233ec4cd2366fb656e5fdd1e050a260056fc69e73c5735eeb8cb6b457ef7244a13589946430f138359cc050911
data/CHANGELOG.md CHANGED
@@ -2,6 +2,20 @@

  ## [Unreleased]

+ ## [0.2.1] - 2026-05-23
+
+ ### Added
+
+ - `skip: true` table config attribute to explicitly exclude a table from the dump. Skipped tables produce no schema entry, no `insert-*` file, and no `delete-*` file. Using a skipped table as `--target-table`, or having another non-skipped table reference it via `belongs_to`, raises `ArgumentError` on load. Available for both SQL adapters (`TableConfig`) and the MongoDB adapter (`MongodbCollectionConfig`). ([#26](https://github.com/heyinc/exwiw/pull/26))
+ - `dump` / `explain` subcommands. `dump` is the default and preserves the existing behavior when no subcommand is given. `explain` prints the compiled SQL and its `EXPLAIN` output (estimate-only; `EXPLAIN QUERY PLAN` on SQLite) for each extraction query to stdout without executing the SELECTs. Supported for `mysql2`, `postgresql`, and `sqlite3`; `mongodb` is not yet supported. ([#28](https://github.com/heyinc/exwiw/pull/28))
+
+ ## [0.2.0] - 2026-05-22
+
+ ### Added
+
+ - `--insert-only` CLI flag to skip generating `delete-*.sql` files. ([#23](https://github.com/heyinc/exwiw/pull/23))
+ - `--after-insert-hook` CLI flag to run a hook after per-table insert/delete files are generated. `.rb` hooks evaluate `insert_sql` DSL via ERB and write the result to `insert-{N+1}-after_insert.{ext}`; other executables run as a child process with `EXWIW_*` environment variables for pure side-effect hooks. ([#24](https://github.com/heyinc/exwiw/pull/24))
+
  ## [0.1.9] - 2026-05-21

  Added
data/README.md CHANGED
@@ -44,7 +44,12 @@ gem install exwiw

  ## Usage

- ### Command
+ exwiw has two subcommands:
+
+ - `dump` (default): generate INSERT/COPY SQL files. This is the existing behavior; if the subcommand is omitted, `dump` is assumed for backwards compatibility.
+ - `explain`: print the compiled SQL and its `EXPLAIN` output for each query that `dump` would run, without executing the SELECTs.
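The fallback to `dump` can be pictured with a small stand-alone sketch. It mirrors the check that the `cli.rb` hunk of this diff performs in `CLI#initialize`, but the helper name `split_subcommand` is hypothetical and not part of exwiw.

```ruby
# Hypothetical helper illustrating the subcommand fallback; not exwiw's API.
KNOWN_SUBCOMMANDS = %w[dump explain].freeze

# Returns [subcommand, remaining_args]. Falls back to "dump" when the first
# argument is missing, looks like an option, or is not a known subcommand.
def split_subcommand(argv)
  args = argv.dup
  first = args.first
  if first && !first.start_with?("-") && KNOWN_SUBCOMMANDS.include?(first)
    [args.shift, args]
  else
    ["dump", args]
  end
end

p split_subcommand(%w[explain --adapter=sqlite3])
p split_subcommand(%w[--adapter=sqlite3])
```

Because only a known first word is consumed, existing invocations that start directly with options keep working unchanged.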
+
+ ### `exwiw dump`

  ```bash
  # dump & masking all records from database to dump.sql based on schema.json
@@ -92,6 +97,22 @@ you need to delete the records before importing the dump,
  This SQL deletes all records related to the extraction targets.
  The idx has the same meaning as in the insert SQL.

+ ### `exwiw explain`
+
+ Print the compiled SQL and its `EXPLAIN` output (estimate-only; `EXPLAIN QUERY PLAN` on SQLite) for each query that `dump` would run, to stdout. No SELECT is executed. Supported for `mysql2`, `postgresql`, and `sqlite3`. The `mongodb` adapter is not yet supported.
+
+ ```bash
+ # preview the queries exwiw would run, without executing the SELECTs
+ exwiw explain \
+   --adapter=postgresql \
+   --host=localhost --port=5432 --user=reader \
+   --database=app_production \
+   --config-dir=exwiw \
+   --target-table=shops --ids=1
+ ```
+
+ The `--output-dir`, `--output-format`, `--insert-only`, and `--after-insert-hook` options are dump-specific and are rejected when used with `explain`.
+
  ### Generator

  The config generator is provided as a Rake task.
@@ -136,21 +157,7 @@ This is an example of the one table schema:

  ### Output format

- By default, exwiw generates `INSERT` statements. For PostgreSQL, you can use `--output-format=copy` to generate `COPY FROM stdin` format instead, which is significantly faster for bulk loading:
-
- ```bash
- exwiw \
-   --adapter=postgresql \
-   --host=localhost \
-   --port=5432 \
-   --user=reader \
-   --database=app_production \
-   --config-dir=exwiw \
-   --target-table=shops \
-   --ids=1 \
-   --output-dir=dump \
-   --output-format=copy
- ```
+ By default, exwiw generates `INSERT` statements. For PostgreSQL, you can pass `--output-format=copy` to generate `COPY FROM stdin` format instead, which is significantly faster for bulk loading.

  The generated file uses tab-separated values with PostgreSQL's text-format escaping (`\N` for NULL, `\\` for backslash, etc.). Import with `psql`:

@@ -160,6 +167,60 @@ psql -d app_dev -f dump/insert-001-shops.sql

  `--output-format=copy` is only supported with the `postgresql` adapter.
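To make the text-format escaping mentioned above concrete, here is a minimal sketch of how one row could be encoded for `COPY ... FROM stdin`. It is not exwiw's implementation (that lives in the adapter's `to_copy_from_stdin`); the names `copy_escape` and `copy_row` are hypothetical, and only the escapes for backslash, tab, newline, carriage return and NULL are covered.

```ruby
# Hypothetical helpers; a sketch of PostgreSQL COPY text-format escaping.
COPY_ESCAPES = {
  "\\" => "\\\\", # backslash       -> \\
  "\t" => "\\t",  # tab             -> \t
  "\n" => "\\n",  # newline         -> \n
  "\r" => "\\r",  # carriage return -> \r
}.freeze

# Escape a single value; nil becomes the NULL marker \N.
def copy_escape(value)
  return "\\N" if value.nil?

  value.to_s.gsub(/[\\\t\n\r]/) { |ch| COPY_ESCAPES.fetch(ch) }
end

# Join escaped values with tabs, the default COPY text delimiter.
def copy_row(values)
  values.map { |v| copy_escape(v) }.join("\t")
end

puts copy_row([1, nil, "line1\nline2"])
```

Escaping the backslash in the same pass as the other characters avoids double-escaping sequences that were already produced.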

+ ### Skip DELETE SQL output
+
+ By default, exwiw generates `delete-*.sql` files alongside the `insert-*.sql` files so that an existing dataset can be cleared before re-inserting. Pass `--insert-only` when you only need the insert files.
+
+ ### After-insert hook
+
+ `--after-insert-hook=PATH` runs a post-processing hook **after** all per-table insert/delete files have been written. The hook can be either a Ruby file (`.rb`) or any executable script (e.g. `.sh`).
+
+ **Ruby hook (`.rb`)**: provides a tiny DSL with two builtins:
+
+ - `cli_options`: Hash of all parsed CLI options (e.g. `cli_options.fetch(:ids)` returns the `--ids` array).
+ - `insert_sql(template)`: appends an ERB-rendered string to a buffer. After the hook finishes, the buffer is concatenated and written to `insert-{N+1}-after_insert.{ext}` where `{N+1}` is one past the last per-table insert file. For the MongoDB adapter the equivalent alias `insert_jsonl(template)` is available; output goes to `insert-{N+1}-after_insert.jsonl`. Multiple `insert_sql` calls in a single hook are joined with `"\n"` into the same file. If no `insert_sql` call is made, no file is created.
+
+ Example `hooks/seed_default_users.rb`:
+
+ ```ruby
+ insert_sql <<~SQL
+   -- seed default users for tenants <%= cli_options.fetch(:ids).join(',') %>
+   <%- cli_options.fetch(:ids).each do |tenant_id| -%>
+   INSERT INTO users (tenant_id, email) VALUES (<%= tenant_id %>, 'default@example.com');
+   <%- end -%>
+ SQL
+ ```
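The buffering behaviour described above can be tried in isolation. The following is a stand-alone sketch modelled on the `Context` class that appears later in this diff; the class name `HookContextSketch` and the sample options are made up for illustration, and nothing is written to disk.

```ruby
require "erb"

# Hypothetical stand-in for the hook evaluation context; not exwiw's class.
class HookContextSketch
  attr_reader :cli_options, :collected

  def initialize(cli_options)
    @cli_options = cli_options
    @collected = []
  end

  # Render the template with ERB (trim_mode "-") and buffer the result.
  def insert_sql(template)
    @collected << ERB.new(template, trim_mode: "-").result(binding)
  end
end

ctx = HookContextSketch.new({ ids: [1, 2] }.freeze)

# A hook file would be evaluated like this block, with `self` set to ctx.
ctx.instance_eval do
  insert_sql "-- tenants <%= cli_options.fetch(:ids).join(',') %>"
  insert_sql "SELECT 1;"
end

# Buffered fragments are joined with a newline, as the README describes.
output = ctx.collected.join("\n")
puts output
```

Rendering inside the method's own `binding` is what lets templates call `cli_options` without any extra plumbing.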
193
+
194
+ **Shell hook**: anything other than `.rb` is exec'd as a child process. It is a pure side-effect hook — exwiw does not capture its stdout. The hook receives these env vars and inherits `DATABASE_PASSWORD` from the parent:
195
+
196
+ - `EXWIW_OUTPUT_DIR`, `EXWIW_CONFIG_DIR`
197
+ - `EXWIW_DATABASE_ADAPTER`, `EXWIW_DATABASE_HOST`, `EXWIW_DATABASE_PORT`, `EXWIW_DATABASE_USER`, `EXWIW_DATABASE_NAME`
198
+ - `EXWIW_TARGET_TABLE`, `EXWIW_IDS` (comma-separated), `EXWIW_OUTPUT_FORMAT`
199
+
200
+ A non-zero exit code from the shell hook aborts exwiw.
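On the hook side, the variables listed above arrive as plain strings. As a sketch, an executable hook written in Ruby (saved without the `.rb` extension so that it takes the child-process path) might read them as follows. The helper name `read_exwiw_env` is hypothetical; only the `EXWIW_*` names come from the README.

```ruby
# Hypothetical helper for an executable hook; not part of exwiw.
# Accepts any Hash-like object so it can be exercised without touching ENV.
def read_exwiw_env(env = ENV)
  {
    output_dir: env["EXWIW_OUTPUT_DIR"],
    adapter: env["EXWIW_DATABASE_ADAPTER"],
    target_table: env["EXWIW_TARGET_TABLE"],
    # EXWIW_IDS is comma-separated; an unset or empty value yields [].
    ids: env.fetch("EXWIW_IDS", "").split(","),
  }
end

settings = read_exwiw_env(
  "EXWIW_OUTPUT_DIR" => "/tmp/dump",
  "EXWIW_DATABASE_ADAPTER" => "sqlite3",
  "EXWIW_TARGET_TABLE" => "shops",
  "EXWIW_IDS" => "1,2",
)
p settings
```

A real hook would call `read_exwiw_env` with no argument and signal failure through a non-zero exit status, for example via `exit 1`.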
+
+ Note: Ruby hooks are evaluated via `instance_eval` inside the exwiw process, so only pass paths you trust.
+
+ ### Skip a table
+
+ Set `"skip": true` in a table's config JSON to explicitly exclude it from the dump. The table is omitted from `insert-000-schema.{sql,js}`, and no `insert-*` / `delete-*` files are generated for it. Skipped tables are also not queried at all.
+
+ ```json
+ {
+   "name": "audit_logs",
+   "primary_key": "id",
+   "skip": true,
+   "belongs_tos": [],
+   "columns": [{ "name": "id" }]
+ }
+ ```
+
+ Constraints:
+
+ - If another non-skipped table has a `belongs_to` entry pointing at a skipped table, exwiw raises `ArgumentError` on load. Remove the `belongs_to` entry on the referencing table, or unset `skip` on the referenced table.
+ - Specifying a skipped table as `--target-table` raises `ArgumentError`.
+ - `skip: true` is preserved by `exwiw:schema:generate` regenerations (the existing receiver value wins over the auto-generated config).
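The first two constraints amount to a consistency check over the loaded table configs. The sketch below shows one way such a check could look; it is an illustration only, not exwiw's code. The method name `validate_skips!`, the plain-Hash table representation, and the `table_name` key inside each `belongs_tos` entry are all assumptions.

```ruby
# Hypothetical validation sketch; table configs are modelled as plain Hashes.
def validate_skips!(tables, target_table: nil)
  skipped = tables.select { |t| t[:skip] }.map { |t| t[:name] }

  # A skipped table cannot be the extraction target.
  if target_table && skipped.include?(target_table)
    raise ArgumentError, "target table '#{target_table}' is skipped"
  end

  # A non-skipped table must not reference a skipped one.
  tables.reject { |t| t[:skip] }.each do |table|
    table.fetch(:belongs_tos, []).each do |rel|
      next unless skipped.include?(rel[:table_name])

      raise ArgumentError,
            "'#{table[:name]}' belongs_to skipped table '#{rel[:table_name]}'"
    end
  end
  nil
end

tables = [
  { name: "shops", belongs_tos: [] },
  { name: "audit_logs", skip: true, belongs_tos: [] },
]
validate_skips!(tables, target_table: "shops") # passes silently
```

Failing at load time, before any query runs, keeps a misconfigured `skip` from producing a partial dump.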
+
  ### Bulk insert chunk size

  `bulk_insert_chunk_size` splits the generated `INSERT` statement into multiple statements, each containing at most the specified number of rows. This is useful when the number of records per table is large enough to hit limits like MySQL's `max_allowed_packet`.
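As an illustration of the chunking idea (not exwiw's actual generator), rows can be split with `Enumerable#each_slice`. The helper name `chunked_inserts` is hypothetical, and the row values are assumed to be already-quoted SQL literals.

```ruby
# Hypothetical sketch: build one INSERT statement per chunk of rows.
# `rows` is an array of arrays whose values are already valid SQL literals.
def chunked_inserts(table, columns, rows, chunk_size)
  rows.each_slice(chunk_size).map do |chunk|
    values = chunk.map { |row| "(#{row.join(', ')})" }.join(",\n")
    "INSERT INTO #{table} (#{columns.join(', ')}) VALUES\n#{values};"
  end
end

statements = chunked_inserts(
  "shops",
  %w[id name],
  [[1, "'a'"], [2, "'b'"], [3, "'c'"]],
  2,
)
puts statements
```

With three rows and a chunk size of 2, this yields two statements, the second carrying the single remaining row.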
@@ -0,0 +1,189 @@
+ # Plan: `--after-insert-hook` hook (Ruby DSL / shell script)
+
+ ## Context
+
+ Currently, `exwiw` generates `insert-000-schema.{sql,js}` → `insert-NNN-{table}.{sql,jsonl}` → `delete-NNN-...` and stops there. In real-world use there are cases where you want to follow up with **post-processing SQL / side effects** based on the extraction result, such as "insert a default user into specific tenants after import" or "write one audit-log row".
+
+ Writing these by hand as a separate file every time is tedious, so it should be possible to describe hooks as part of the extraction job. A hook should be able to reference CLI options such as `--ids` (e.g. "seed default users for the array of extracted tenant IDs").
+
+ Goals:
+
+ - Add an `--after-insert=PATH` option. `PATH` may be a `.rb` or `.sh` file.
+ - `.rb` extension: provides a lightweight DSL (`cli_options`, `insert_sql` / `insert_jsonl`). String arguments are evaluated as ERB, and the results are concatenated and written out as the last insert file.
+ - `.sh` extension (and anything other than `.rb`): run as a child process with the CLI options passed via environment variables. No output file is generated (a pure side-effect hook).
+ - The hook runs exactly once, after the per-table insert/delete loop completes.
+
+ ## Design
+
+ ### CLI layer
+ **File**: `lib/exwiw/cli.rb`
+
+ - Initialize `@after_insert_path = nil` (in `initialize`).
+ - Inside `parser`, add `opts.on("--after-insert=[PATH]", "Path to a .rb or .sh post-processing hook") { |v| @after_insert_path = File.expand_path(v) }`.
+ - In `validate_options!`, verify the following:
+   - When the path does not exist: `$stderr.puts "--after-insert file not found: #{@after_insert_path}"; exit 1`
+   - When the extension is neither `.rb` nor `.sh` and the executable bit is not set either: exit 1 with `--after-insert must be a .rb file or an executable script`.
+ - Add a **method that converts all CLI options into a Hash** for `cli_options` (`build_cli_options_hash`):
+   ```ruby
+   {
+     database_host: @database_host, database_port: @database_port,
+     database_user: @database_user, database_password: @database_password,
+     output_dir: @output_dir, config_dir: @config_dir,
+     database_adapter: @database_adapter, database_name: @database_name,
+     target_table: @target_table_name, ids: @ids.dup.freeze,
+     output_format: @output_format, insert_only: @insert_only,
+     log_level: @log_level, after_insert: @after_insert_path,
+   }.freeze
+   ```
+ - Add `after_insert_path: @after_insert_path, cli_options: build_cli_options_hash` to the `Runner.new(...)` call.
+
+ ### Runner integration
+ **File**: `lib/exwiw/runner.rb`
+
+ - Add `after_insert_path: nil, cli_options: {}` to the keyword arguments of `initialize` and store them in instance variables.
+ - **Immediately after** the per-table loop (`ordered_table_names.each_with_index`) in `run` finishes (right after the current line 98):
+   ```ruby
+   if @after_insert_path
+     @logger.info("Running after-insert hook: #{@after_insert_path}")
+     AfterInsertHook.run(
+       path: @after_insert_path,
+       cli_options: @cli_options,
+       output_dir: @output_dir,
+       next_idx: total_size + 1,
+       output_extension: adapter.output_extension,
+       logger: @logger,
+     )
+   end
+   ```
+ - `total_size` is an existing variable (`ordered_table_names.size`). The schema uses `000` and the per-table files use `001..total_size`, so the hook output is number `total_size + 1`. `delete-*` files use reverse-order numbering, so they do not collide.
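The numbering scheme can be sketched as a small stand-alone function. This is an illustration under the plan's stated layout (schema at `000`, tables at `001..total_size`, hook output at `total_size + 1`, zero-padded to three digits); the helper name `insert_file_names` is hypothetical and a single file extension is assumed for all files.

```ruby
# Hypothetical sketch of the insert-file numbering described in the plan.
def insert_file_names(table_names, extension: "sql", with_hook: false)
  names = ["insert-000-schema.#{extension}"]

  # Per-table files are numbered from 001 in dependency order.
  table_names.each_with_index do |table, i|
    names << format("insert-%03d-%s.%s", i + 1, table, extension)
  end

  # The hook output takes the next free index after the last table.
  if with_hook
    names << format("insert-%03d-after_insert.%s", table_names.size + 1, extension)
  end
  names
end

puts insert_file_names(%w[shops users], with_hook: true)
```

Because the hook file sorts after every per-table file, loading the `insert-*` files in lexical order applies the hook output last.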
+
+ ### New file: `lib/exwiw/after_insert_hook.rb`
+
+ ```ruby
+ require 'erb'
+ require 'shellwords'
+
+ module Exwiw
+   class AfterInsertHook
+     def self.run(path:, cli_options:, output_dir:, next_idx:, output_extension:, logger:)
+       ext = File.extname(path)
+       idx_str = next_idx.to_s.rjust(3, '0')
+       output_path = File.join(output_dir, "insert-#{idx_str}-after_insert.#{output_extension}")
+
+       case ext
+       when '.rb'
+         run_ruby(path: path, cli_options: cli_options, output_path: output_path, logger: logger)
+       else
+         run_shell(path: path, cli_options: cli_options, output_dir: output_dir, logger: logger)
+       end
+     end
+
+     def self.run_ruby(path:, cli_options:, output_path:, logger:)
+       ctx = Context.new(cli_options)
+       ctx.instance_eval(File.read(path), path)
+       sql = ctx.collected.join("\n")
+       if sql.empty?
+         logger.info("After-insert hook produced no output; skipping file write.")
+         return
+       end
+       File.write(output_path, sql)
+       logger.info("Wrote after-insert hook output to #{output_path}")
+     end
+
+     def self.run_shell(path:, cli_options:, output_dir:, logger:)
+       env = {
+         'EXWIW_OUTPUT_DIR' => output_dir,
+         'EXWIW_CONFIG_DIR' => cli_options[:config_dir].to_s,
+         'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
+         'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
+         'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
+         'EXWIW_DATABASE_USER' => cli_options[:database_user].to_s,
+         'EXWIW_DATABASE_NAME' => cli_options[:database_name].to_s,
+         'EXWIW_TARGET_TABLE' => cli_options[:target_table].to_s,
+         'EXWIW_IDS' => Array(cli_options[:ids]).join(','),
+         'EXWIW_OUTPUT_FORMAT' => cli_options[:output_format].to_s,
+       }
+       # DATABASE_PASSWORD is inherited as-is from the existing ENV (the env hash does not override it).
+       ok = system(env, path)
+       raise "after-insert shell hook failed: #{path}" unless ok
+     end
+
+     class Context
+       attr_reader :cli_options, :collected
+
+       def initialize(cli_options)
+         @cli_options = cli_options
+         @collected = []
+       end
+
+       # ERB evaluation. An `insert_jsonl` alias is also provided for MongoDB (the output file name and
+       # extension are decided on the Runner side from adapter.output_extension, so either call fills the same buffer).
+       def insert_sql(template)
+         @collected << ERB.new(template, trim_mode: '-').result(binding)
+       end
+       alias_method :insert_jsonl, :insert_sql
+     end
+   end
+ end
+ ```
+
+ Key points:
+ - With `instance_eval(File.read(path), path)`, `cli_options` and `insert_sql` can be used as plain method calls inside the hook file (DSL style).
+ - ERB evaluation uses the `binding` of `Context#insert_sql`, so `cli_options.fetch(:ids)` can also be called inside ERB templates.
+ - Calling `insert_sql` / `insert_jsonl` multiple times accumulates into `@collected`, which is finally joined with `"\n"` and written to a single file.
+ - Nothing is written when the output is empty (behavior close to idempotency).
+ - Shell execution uses `system(env, path)`. `path` is passed as a single string, so a plain path without spaces or shell metacharacters is executed without shell expansion (Shellwords is not needed). On failure, an exception stops the run → exit.
+ - ENV names are namespaced with the `EXWIW_*` prefix. `DATABASE_PASSWORD` is inherited from the parent process ENV (the hook can read it if it wants to).
+
+ ### Add the require to lib/exwiw.rb
+ Add `require_relative "exwiw/after_insert_hook"` next to the require of `runner.rb`.
+
+ ### Usage example (to be added to the README)
+
+ `hooks/seed_default_users.rb`:
+ ```ruby
+ # cli_options[:ids] holds the array passed via --ids
+ insert_sql <<~SQL
+   -- seed default users for tenants <%= cli_options.fetch(:ids).join(',') %>
+   <%- cli_options.fetch(:ids).each do |tenant_id| -%>
+   INSERT INTO users (tenant_id, email) VALUES (<%= tenant_id %>, 'default@example.com');
+   <%- end -%>
+ SQL
+ ```
+
+ Run:
+ ```
+ exwiw --adapter=mysql2 ... --target-table=shops --ids=1,2 \
+   --after-insert=hooks/seed_default_users.rb
+ ```
+ Result: `dump/insert-{total+1}-after_insert.sql` is written, containing the INSERTs for tenants 1 and 2.
+
+ ## Files to modify / add
+
+ | Path | Change |
+ |---|---|
+ | `lib/exwiw/cli.rb` | Add `--after-insert` parsing / validation / `build_cli_options_hash`; propagate to the Runner call |
+ | `lib/exwiw/runner.rb` | Add `after_insert_path:` and `cli_options:` to `initialize`; invoke the hook after the per-table loop completes |
+ | `lib/exwiw/after_insert_hook.rb` (new) | `AfterInsertHook.run` + `Context` (DSL + ERB) |
+ | `lib/exwiw.rb` | Add `require_relative "exwiw/after_insert_hook"` |
+ | `README.md` | Add a section on `--after-insert` (Ruby DSL example / shell hook example / list of environment variables) |
+ | `spec/runner_spec.rb` | Assert that a Ruby hook writes `insert-{N+1}-after_insert.sql` and that ERB expands `cli_options.fetch(:ids)` |
+ | `spec/after_insert_hook_spec.rb` (new) | Assert ERB evaluation in `Context#insert_sql` and that multiple calls are joined with `"\n"` |
+
+ ## Verification
+
+ 1. **Unit test**: `bundle exec rspec spec/after_insert_hook_spec.rb`. Call `Context#insert_sql` directly and confirm that ERB can resolve `cli_options.fetch(:ids)` and that multiple calls are joined with `\n`.
+ 2. **Integration test**: `bundle exec rspec spec/runner_spec.rb`. Run the Runner for real via sqlite3, pass a hook `.rb` written to tmp as the equivalent of `--after-insert`, and assert that `tmp/.../insert-{N+1}-after_insert.sql` is generated and that the file contains the ERB-expanded `--ids` values.
+ 3. **CLI E2E**: temporarily edit the scenario's `test_with_sqlite3.sh` to try an invocation with `--after-insert=`, and visually confirm that the expected files are placed in the output directory. For MongoDB, only smoke-test that a `.jsonl` file is produced when `insert_jsonl` is used with `--after-insert=hook.rb`.
+ 4. **Edge cases**:
+    - `--after-insert=missing.rb` exits with a clear error.
+    - When the hook never calls `insert_sql`, no file is created and only an info log is emitted.
+    - A non-zero exit code from a shell hook brings down the Runner.
+    - When `--ids` is omitted, `cli_options.fetch(:ids)` returns an empty array (the `@ids = []` initial value is preserved).
+
+ ## Notes / known risks
+
+ - **Arbitrary code execution**: `.rb` hooks are executed via `instance_eval` (i.e. they run with the same privileges as the exwiw process). This is acceptable because users are expected to pass hooks they prepared themselves, but a "trusted sources only" note will be added to the README.
+ - **File number collisions**: `delete-*` uses reverse-order numbering (`total_size - idx`), so it does not collide directly with `insert-{total+1}-after_insert` (the range is `delete-001`...`delete-{total}`).
+ - **Limits of MongoDB support**: `insert_jsonl` writes the ERB output as-is to `.jsonl`. Keeping it to one document per line is the user's responsibility. The file is assumed to be loadable with `mongoimport`.
+ - **Password handling**: `DATABASE_PASSWORD` is not explicitly put into the shell hook's env (it is inherited from the parent process ENV). To prevent leakage via the process list, only the `EXWIW_*` values are passed through the env hash.
@@ -0,0 +1,91 @@
+ # Add a scenario_test that validates the SQL produced by PostgreSQL COPY mode
+
+ ## Context
+
+ The PostgreSQL adapter of `exwiw` can switch to `COPY ... FROM stdin;` output with `--output-format=copy` (`lib/exwiw/adapter/postgresql_adapter.rb:77-85`). In the existing tests:
+
+ - Unit tests (`spec/adapter/postgresql_adapter_spec.rb:256-305`) verify the string format
+ - Runner integration tests (`spec/runner_spec.rb:236-286`) verify the file structure
+
+ However, **there is no end-to-end test that verifies whether the generated COPY-mode SQL can actually be loaded with `psql -f`**. The user suspects that COPY mode may be emitting invalid SQL and wants to verify this against a real database.
+
+ For the existing INSERT mode, `scenario/test_with_postgresql.sh` verifies everything up to re-importing with `psql -f`. There is currently no COPY-mode counterpart.
+
+ Goal: add an E2E scenario that actually feeds COPY-mode output to psql, plus a snapshot regression test, to surface any latent invalid SQL.
+
+ ## Files changed
+
+ 1. **New** `scenario/test_with_postgresql_copy.sh`: E2E shell script
+ 2. **Modified** `spec/insert_output_snapshot_spec.rb`: a SCENARIOS entry for COPY and `snapshot_subdir` support
+ 3. **Modified** `.github/workflows/scenario.yml`: a new step in the `with_postgres` job
+ 4. **New** `spec/insert_output_snapshots/postgresql-copy/insert-*.sql`: auto-generated with `UPDATE_SNAPSHOTS=1`
+
+ ## Details
+
+ ### 1. `scenario/test_with_postgresql_copy.sh`
+
+ Use `scenario/test_with_postgresql.sh` as the template and change only the following:
+
+ - `FROM_DATABASE_NAME="exwiw_scenario_prod_db_copy"`
+ - `TO_DATABASE_NAME="exwiw_scenario_dev_db_copy"` (names that do not collide with the existing scenario even when run in parallel)
+ - Add `--output-format=copy` to `exe/exwiw`
+ - Change to `--output-dir=tmp/postgresql-copy`
+ - The `delete-*.sql` / `insert-*.sql` loops also reference `tmp/postgresql-copy/`
+
+ With `set -e`, the script fails immediately if psql exits on a syntax or data error in a COPY block. This is the detection point for the "invalid SQL" the user suspects.
+
+ The final check (whether `INSERT INTO shops ... ` succeeds with auto-increment) is reused as-is; it also covers the `post_insert_sql` (sequence setval) appended after `to_copy_from_stdin`.
+
+ Grant execute permission with `chmod +x` (matching the sibling scripts).
+
+ ### 2. `spec/insert_output_snapshot_spec.rb`
+
+ - Change `snapshot_dir` on line 78 to `scenario[:snapshot_subdir] || scenario[:adapter]` so that a single adapter can have multiple scenarios
+ - Add a suffix to the context label on line 76 when `output_format` is present, so the scenarios can be told apart in the rspec output
+ - Add the following to the `SCENARIOS` array:
+
+ ```ruby
+ {
+   adapter: "postgresql",
+   config_dir: "scenario/postgresql-schema",
+   output_format: "copy",
+   snapshot_subdir: "postgresql-copy",
+   connection: { adapter: "postgresql", database_name: "exwiw_test",
+                 host: "127.0.0.1", port: 5432,
+                 user: "postgres", password: "test_password" },
+ },
+ ```
+
+ `scenario.fetch(:output_format, "insert")` on line 86 already exists, so no further change is needed. The pg_dump normalization of `insert-000-schema.sql` (lines 21-25) also applies as-is because the adapter is the same.
+
+ ### 3. `.github/workflows/scenario.yml`
+
+ Add the following after the "Run exwiw (from clean target DB)" step (line 115) of the `with_postgres` job:
+
+ ```yaml
+ - name: Run exwiw (copy mode)
+   run: scenario/test_with_postgresql_copy.sh
+ ```
+
+ The `postgres:17-alpine` service and the `postgresql-client-17` installation are already handled by existing steps, so nothing needs to be added.
+
+ ### 4. Snapshot generation
+
+ ```
+ UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb
+ ```
+
+ `insert-000-schema.sql` plus the equivalent of `insert-001-shops.sql` ... `insert-007-transactions.sql` are generated under `spec/insert_output_snapshots/postgresql-copy/`. Commit these to git.
+
+ ## Verification steps
+
+ 1. Start `docker compose up -d postgres` locally
+ 2. Run `bash scenario/test_with_postgresql_copy.sh`. Exit 0 means the COPY-mode SQL is valid when loaded through psql. A non-zero exit surfaces invalid SQL (at that point, identify the cause and fix it separately)
+ 3. Generate the snapshots with `UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb`
+ 4. Re-run `bundle exec rspec spec/insert_output_snapshot_spec.rb` without `UPDATE_SNAPSHOTS` and confirm that all scenarios (sqlite3 / mysql2 / postgresql / postgresql-copy / mongodb) pass
+ 5. Confirm on CI that the new `Run exwiw (copy mode)` step in the `with_postgres` job passes (or detects invalid SQL)
+
+ ## Expected outcomes
+
+ - **If the tests pass**: the user's suspicion is unfounded (at least within the range of the seed data). The tests remain as regression tests and will catch early any future COPY-mode change that breaks the SQL
+ - **If the tests fail**: identify the cause from how they fail (the psql error message). The fix is out of scope for this plan and is handled as a separate task (report to the user → decide on an approach → implement)
@@ -97,6 +97,10 @@ module Exwiw
        raise NotImplementedError, "MongodbAdapter does not support bulk delete"
      end

+     def explain(_query)
+       raise NotImplementedError, "MongodbAdapter does not support explain yet"
+     end
+
      def output_extension
        'jsonl'
      end
@@ -14,6 +14,17 @@ module Exwiw
        connection.query(sql, cast: false, as: :array).to_a
      end

+     def explain(query_ast)
+       sql = compile_ast(query_ast)
+
+       @logger.debug(" Executing EXPLAIN: \n#{sql}")
+       rows = connection.query("EXPLAIN #{sql}", cast: false).to_a
+       rows.each_with_index.flat_map do |row, i|
+         ["*************************** #{i + 1}. row ***************************"] +
+           row.map { |k, v| "#{k}: #{v}" }
+       end.join("\n")
+     end
+
      def dump_schema(ordered_tables, output_path)
        require 'open3'

@@ -14,6 +14,13 @@ module Exwiw
        connection.exec(sql).values
      end

+     def explain(query_ast)
+       sql = compile_ast(query_ast)
+
+       @logger.debug(" Executing EXPLAIN: \n#{sql}")
+       connection.exec("EXPLAIN #{sql}").values.map(&:first).join("\n")
+     end
+
      def dump_schema(ordered_tables, output_path)
        require 'open3'

@@ -14,6 +14,14 @@ module Exwiw
        connection.execute(sql)
      end

+     def explain(query_ast)
+       sql = compile_ast(query_ast)
+
+       @logger.debug(" Executing EXPLAIN QUERY PLAN: \n#{sql}")
+       rows = connection.execute("EXPLAIN QUERY PLAN #{sql}")
+       rows.map { |row| row[3] }.join("\n")
+     end
+
      def dump_schema(ordered_tables, output_path)
        @logger.debug(" Reading schema from sqlite_master...")
        target_names = ordered_tables.map(&:name)
data/lib/exwiw/adapter.rb CHANGED
@@ -74,6 +74,13 @@ module Exwiw
      def to_copy_from_stdin(_results, _table)
        raise NotImplementedError, "COPY format is not supported by #{self.class.name}"
      end
+
+     # Run the database-specific EXPLAIN for the given query and return the
+     # output as a single string for `explain` subcommand to print.
+     # SQL adapters override; MongodbAdapter currently raises.
+     def explain(_query_ast)
+       raise NotImplementedError, "#{self.class.name} does not implement #explain"
+     end
    end

    # @params [Exwiw::QueryAst] query_ast
@@ -0,0 +1,63 @@
+ # frozen_string_literal: true
+
+ require 'erb'
+
+ module Exwiw
+   class AfterInsertHook
+     def self.run(path:, cli_options:, output_dir:, next_idx:, output_extension:, logger:)
+       ext = File.extname(path)
+       idx_str = next_idx.to_s.rjust(3, '0')
+       output_path = File.join(output_dir, "insert-#{idx_str}-after_insert.#{output_extension}")
+
+       if ext == '.rb'
+         run_ruby(path: path, cli_options: cli_options, output_path: output_path, logger: logger)
+       else
+         run_shell(path: path, cli_options: cli_options, output_dir: output_dir, logger: logger)
+       end
+     end
+
+     def self.run_ruby(path:, cli_options:, output_path:, logger:)
+       ctx = Context.new(cli_options)
+       ctx.instance_eval(File.read(path), path)
+       content = ctx.collected.join("\n")
+       if content.empty?
+         logger.info("After-insert hook produced no output; skipping file write.")
+         return
+       end
+       File.write(output_path, content)
+       logger.info("Wrote after-insert hook output to #{output_path}")
+     end
+
+     def self.run_shell(path:, cli_options:, output_dir:, logger:)
+       env = {
+         'EXWIW_OUTPUT_DIR' => output_dir,
+         'EXWIW_CONFIG_DIR' => cli_options[:config_dir].to_s,
+         'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
+         'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
+         'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
+         'EXWIW_DATABASE_USER' => cli_options[:database_user].to_s,
+         'EXWIW_DATABASE_NAME' => cli_options[:database_name].to_s,
+         'EXWIW_TARGET_TABLE' => cli_options[:target_table].to_s,
+         'EXWIW_IDS' => Array(cli_options[:ids]).join(','),
+         'EXWIW_OUTPUT_FORMAT' => cli_options[:output_format].to_s,
+       }
+       logger.info("Running after-insert shell hook: #{path}")
+       ok = system(env, path)
+       raise "after-insert shell hook failed: #{path}" unless ok
+     end
+
+     class Context
+       attr_reader :cli_options, :collected
+
+       def initialize(cli_options)
+         @cli_options = cli_options
+         @collected = []
+       end
+
+       def insert_sql(template)
+         @collected << ERB.new(template, trim_mode: '-').result(binding)
+       end
+       alias_method :insert_jsonl, :insert_sql
+     end
+   end
+ end
data/lib/exwiw/cli.rb CHANGED
@@ -10,25 +10,37 @@ require 'exwiw'

  module Exwiw
    class CLI
+     KNOWN_SUBCOMMANDS = %w[dump explain].freeze
+
      def self.start(argv)
        new(argv).run
      end

      def initialize(argv)
        @argv = argv.dup
-       @help = argv.empty?
+
+       @subcommand =
+         if !@argv.empty? && !@argv.first.start_with?("-") && KNOWN_SUBCOMMANDS.include?(@argv.first)
+           @argv.shift
+         else
+           "dump"
+         end
+
+       @help = @argv.empty?

        @database_host = nil
        @database_port = nil
        @database_user = nil
        @database_password = ENV["DATABASE_PASSWORD"]
-       @output_dir = "dump"
+       @output_dir = nil
        @config_dir = nil
        @database_adapter = nil
        @database_name = nil
        @target_table_name = nil
        @ids = []
-       @output_format = 'insert'
+       @output_format = nil
+       @insert_only = nil
+       @after_insert_hook_path = nil
        @log_level = :info

        parser.parse!(@argv)
@@ -37,37 +49,56 @@ module Exwiw
      def run
        if @help
          puts parser.help
-       else
-         validate_options!
+         return
+       end

-         connection_config = ConnectionConfig.new(
-           adapter: @database_adapter,
-           host: @database_host,
-           port: @database_port,
-           user: @database_user,
-           password: @database_password,
-           database_name: @database_name,
-         )
+       validate_options!

-         dump_target = DumpTarget.new(
-           table_name: @target_table_name,
-           ids: @ids,
-         )
+       connection_config = ConnectionConfig.new(
+         adapter: @database_adapter,
+         host: @database_host,
+         port: @database_port,
+         user: @database_user,
+         password: @database_password,
+         database_name: @database_name,
+       )

-         logger = build_logger
+       dump_target = DumpTarget.new(
+         table_name: @target_table_name,
+         ids: @ids,
+       )
+
+       logger = build_logger

+       case @subcommand
+       when "dump"
          Runner.new(
            connection_config: connection_config,
            output_dir: @output_dir,
            config_dir: @config_dir,
            dump_target: dump_target,
            output_format: @output_format,
+           insert_only: @insert_only,
+           after_insert_hook_path: @after_insert_hook_path,
+           cli_options: build_cli_options_hash,
+           logger: logger,
+         ).run
+       when "explain"
+         ExplainRunner.new(
+           connection_config: connection_config,
+           config_dir: @config_dir,
+           dump_target: dump_target,
            logger: logger,
+           io: $stdout,
          ).run
        end
      end

      private def validate_options!
+       if @subcommand == "explain"
+         validate_explain_only!
+       end
+
        if @database_adapter != "sqlite3"
          required_options = {
            "Target database host" => @database_host,
@@ -94,15 +125,21 @@ module Exwiw
          exit 1
        end

-       valid_output_formats = ["insert", "copy"]
-       unless valid_output_formats.include?(@output_format)
-         $stderr.puts "Invalid output format '#{@output_format}'. Available options are: #{valid_output_formats.join(', ')}"
-         exit 1
-       end
+       if @subcommand == "dump"
+         @output_dir ||= "dump"
+         @output_format ||= "insert"
+         @insert_only = @insert_only ? true : false

-       if @output_format == "copy" && @database_adapter != "postgresql"
-         $stderr.puts "--output-format=copy is only supported with the postgresql adapter"
-         exit 1
+         valid_output_formats = ["insert", "copy"]
+         unless valid_output_formats.include?(@output_format)
+           $stderr.puts "Invalid output format '#{@output_format}'. Available options are: #{valid_output_formats.join(', ')}"
+           exit 1
+         end
+
+         if @output_format == "copy" && @database_adapter != "postgresql"
+           $stderr.puts "--output-format=copy is only supported with the postgresql adapter"
+           exit 1
+         end
        end

        if @config_dir.nil?
@@ -129,6 +166,56 @@ module Exwiw
  $stderr.puts "--target-table is required when --ids is specified"
  exit 1
  end
+
+ if @after_insert_hook_path
+ unless File.file?(@after_insert_hook_path)
+ $stderr.puts "--after-insert-hook file not found: #{@after_insert_hook_path}"
+ exit 1
+ end
+
+ ext = File.extname(@after_insert_hook_path)
+ if ext != '.rb' && !File.executable?(@after_insert_hook_path)
+ $stderr.puts "--after-insert-hook must be a .rb file or an executable script: #{@after_insert_hook_path}"
+ exit 1
+ end
+ end
+ end
+
+ private def validate_explain_only!
+ if @database_adapter == "mongodb"
+ $stderr.puts "mongodb adapter is not yet supported by 'explain' subcommand"
+ exit 1
+ end
+
+ rejected = []
+ rejected << "--output-dir" unless @output_dir.nil?
+ rejected << "--output-format" unless @output_format.nil?
+ rejected << "--insert-only" unless @insert_only.nil?
+ rejected << "--after-insert-hook" unless @after_insert_hook_path.nil?
+
+ unless rejected.empty?
+ $stderr.puts "The following options are not applicable in 'explain' subcommand: #{rejected.join(', ')}"
+ exit 1
+ end
+ end
+
+ private def build_cli_options_hash
+ {
+ database_host: @database_host,
+ database_port: @database_port,
+ database_user: @database_user,
+ database_password: @database_password,
+ output_dir: @output_dir,
+ config_dir: @config_dir,
+ database_adapter: @database_adapter,
+ database_name: @database_name,
+ target_table: @target_table_name,
+ ids: @ids.dup.freeze,
+ output_format: @output_format,
+ insert_only: @insert_only,
+ log_level: @log_level,
+ after_insert_hook: @after_insert_hook_path,
+ }.freeze
  end

  private def build_logger
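Note on the hunk above: `validate_explain_only!` relies on the dump-only options still being `nil` when `explain` runs, which appears to be why the `dump` defaults are now applied only inside the `@subcommand == "dump"` branch. A minimal standalone sketch of that check follows. The helper names `rejected_explain_options` and `explain_error` and the plain-hash input are hypothetical and not part of exwiw; only the flag names and messages are taken from the diff.

```ruby
# Hypothetical helpers (not exwiw API): a plain hash stands in for the CLI's
# instance variables. A nil value means the flag was not given.
def rejected_explain_options(parsed)
  dump_only = {
    "--output-dir" => parsed[:output_dir],
    "--output-format" => parsed[:output_format],
    "--insert-only" => parsed[:insert_only],
    "--after-insert-hook" => parsed[:after_insert_hook],
  }
  # Keep only the flags that were actually supplied.
  dump_only.reject { |_flag, value| value.nil? }.keys
end

# Returns an error message, or nil when the invocation is acceptable.
def explain_error(adapter, parsed)
  return "mongodb adapter is not yet supported by 'explain' subcommand" if adapter == "mongodb"

  rejected = rejected_explain_options(parsed)
  return nil if rejected.empty?

  "The following options are not applicable in 'explain' subcommand: #{rejected.join(', ')}"
end

puts explain_error("postgresql", { output_dir: "/tmp/dump", insert_only: true })
```

Returning a message instead of calling `exit 1` is a choice made for this sketch so the check can be exercised without terminating the process; the gem itself prints to `$stderr` and exits.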
@@ -148,13 +235,22 @@ module Exwiw

  private def parser
  @parser ||= OptionParser.new do |opts|
- opts.banner = "exwiw #{Exwiw::VERSION}"
+ opts.banner = <<~BANNER
+ exwiw #{Exwiw::VERSION}
+
+ Usage: exwiw [SUBCOMMAND] [options]
+
+ Subcommands:
+ dump Generate INSERT/COPY SQL files (default when omitted).
+ explain Print EXPLAIN output for each extraction query to stdout.
+ (not yet supported for the mongodb adapter)
+ BANNER
  opts.version = Exwiw::VERSION

  opts.on("-h", "--host=HOST", "Target database host") { |v| @database_host = v }
  opts.on("-p", "--port=PORT", "Target database port") { |v| @database_port = v }
  opts.on("-u", "--user=USERNAME", "Target database user") { |v| @database_user = v }
- opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/") do |v|
+ opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/ (dump subcommand only)") do |v|
  v = v.end_with?("/") ? v[0..-2] : v
  @output_dir = File.expand_path(v)
  end
@@ -166,7 +262,11 @@ module Exwiw
  opts.on("--database=DATABASE", "Target database name") { |v| @database_name = v }
  opts.on("--target-table=[TABLE]", "Target table for extraction. If omitted, dump all tables.") { |v| @target_table_name = v }
  opts.on("--ids=[IDS]", "Comma-separated list of identifiers. Required when --target-table is given.") { |v| @ids = v.split(',') }
- opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only)") { |v| @output_format = v }
+ opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only, dump subcommand only)") { |v| @output_format = v }
+ opts.on("--insert-only", "Do not generate DELETE SQL files (dump subcommand only)") { @insert_only = true }
+ opts.on("--after-insert-hook=PATH", "Path to a .rb or .sh post-processing hook executed after all insert/delete files are written (dump subcommand only)") do |v|
+ @after_insert_hook_path = File.expand_path(v)
+ end
  opts.on("--log-level=LEVEL", "Log level (debug, info). default is info") { |v| @log_level = v.to_sym }

  opts.on("--help", "Print this help") do
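Note on `--after-insert-hook`: the help text says ".rb or .sh", but the validation added earlier in this file accepts any existing `.rb` file or any other file that is executable. A minimal sketch of that pre-check is shown below. The helper name `after_insert_hook_error` is hypothetical and the temporary file exists only for the demonstration; the `File` calls and messages mirror the diff.

```ruby
require "tempfile"

# Hypothetical helper (not exwiw API): returns an error message, or nil when
# the path would pass the CLI's pre-check.
def after_insert_hook_error(path)
  return "--after-insert-hook file not found: #{path}" unless File.file?(path)

  # A .rb hook is accepted as-is; anything else must have the executable bit.
  if File.extname(path) != ".rb" && !File.executable?(path)
    return "--after-insert-hook must be a .rb file or an executable script: #{path}"
  end

  nil
end

# Demonstration with a throwaway .rb file; it passes without being executable.
Tempfile.create(["hook", ".rb"]) do |file|
  puts after_insert_hook_error(file.path).inspect
end
```

Under this reading, a shell hook needs `chmod +x` before it is accepted, while a Ruby hook does not.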
data/lib/exwiw/explain_runner.rb ADDED
@@ -0,0 +1,83 @@
+ # frozen_string_literal: true
+
+ module Exwiw
+ class ExplainRunner
+ def initialize(
+ connection_config:,
+ config_dir:,
+ dump_target:,
+ logger:,
+ io: $stdout
+ )
+ @connection_config = connection_config
+ @config_dir = config_dir
+ @dump_target = dump_target
+ @logger = logger
+ @io = io
+ end
+
+ def run
+ adapter = Adapter.build(@connection_config, @logger)
+ configs = load_table_config(adapter.class.table_config_class)
+ configs = reject_and_validate_skipped(configs)
+
+ table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
+
+ target = table_by_name[@dump_target.table_name]
+ adapter.validate_as_dump_target!(target) if target
+
+ @logger.debug("Determining table processing order...")
+ ordered_table_names = DetermineTableProcessingOrder.run(configs.select { |c| adapter.dumpable?(c) })
+
+ total_size = ordered_table_names.size
+ ordered_table_names.each_with_index do |table_name, idx|
+ @logger.debug("Explaining '#{table_name}'... (#{idx + 1}/#{total_size})")
+ table = table_by_name.fetch(table_name)
+
+ query_ast = adapter.build_query(table, @dump_target, table_by_name)
+ sql = adapter.compile_ast(query_ast)
+ explain_text = adapter.explain(query_ast)
+
+ @io.puts "-- [#{idx + 1}/#{total_size}] #{table_name}"
+ @io.puts sql
+ @io.puts
+ @io.puts "-- EXPLAIN:"
+ @io.puts explain_text
+ @io.puts
+ end
+ end
+
+ private def load_table_config(klass)
+ Dir[File.join(@config_dir, "*.json")].map do |file|
+ json = JSON.parse(File.read(file))
+ klass.from(json)
+ end
+ end
+
+ private def reject_and_validate_skipped(configs)
+ skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
+ return configs if skipped_names.empty?
+
+ configs.each do |config|
+ next if config.skip
+ next unless config.respond_to?(:belongs_tos)
+
+ dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
+ next if dangling.empty?
+
+ raise ArgumentError,
+ "Table '#{config.name}' has belongs_to references to skipped table(s): " \
+ "#{dangling.map(&:table_name).join(', ')}. " \
+ "Remove the belongs_to entries or unset `skip` on the referenced table."
+ end
+
+ if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
+ raise ArgumentError,
+ "--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
+ end
+
+ skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
+ configs.reject { |c| c.skip }
+ end
+ end
+ end
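Note on `reject_and_validate_skipped`: the check runs in one direction only. A non-skipped table may not reference a skipped one via `belongs_to`, but a skipped table may reference anything, because skipped configs are passed over before their relations are inspected. The sketch below reproduces that behaviour in isolation. `Relation`, `TableStub`, and `reject_skipped` are hypothetical stand-ins, not exwiw classes, and the error messages are shortened.

```ruby
require "set"

# Hypothetical stand-ins for the gem's config objects; only the fields that
# the check reads are modelled here.
Relation = Struct.new(:table_name)
TableStub = Struct.new(:name, :skip, :belongs_tos)

def reject_skipped(configs, target_table: nil)
  skipped = configs.select(&:skip).map(&:name).to_set
  return configs if skipped.empty?

  configs.each do |config|
    next if config.skip # skipped tables may reference anything

    dangling = config.belongs_tos.select { |rel| skipped.include?(rel.table_name) }
    next if dangling.empty?

    raise ArgumentError,
          "Table '#{config.name}' references skipped table(s): #{dangling.map(&:table_name).join(', ')}"
  end

  if target_table && skipped.include?(target_table)
    raise ArgumentError, "--target-table '#{target_table}' is marked skip:true"
  end

  configs.reject(&:skip)
end

users = TableStub.new("users", nil, [])
audit_logs = TableStub.new("audit_logs", true, [Relation.new("users")])
orders = TableStub.new("orders", nil, [Relation.new("users")])

puts reject_skipped([users, audit_logs, orders]).map(&:name).inspect
```

In the diff, the same method body appears in both `ExplainRunner` and `Runner`, so `dump` and `explain` should reject the same configurations.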
data/lib/exwiw/mongodb_collection_config.rb CHANGED
@@ -13,6 +13,7 @@ module Exwiw
  attribute :belongs_tos, array(BelongsTo)
  attribute :fields, array(MongodbField)
  attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
+ attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true

  # Marks this config as physically embedded inside another collection's
  # documents. When set, this config is not processed as a standalone dump
data/lib/exwiw/runner.rb CHANGED
@@ -10,13 +10,19 @@ module Exwiw
  config_dir:,
  dump_target:,
  logger:,
- output_format: 'insert'
+ output_format: 'insert',
+ insert_only: false,
+ after_insert_hook_path: nil,
+ cli_options: {}
  )
  @connection_config = connection_config
  @output_dir = output_dir
  @config_dir = config_dir
  @dump_target = dump_target
  @output_format = output_format
+ @insert_only = insert_only
+ @after_insert_hook_path = after_insert_hook_path
+ @cli_options = cli_options
  @logger = logger
  end

@@ -24,6 +30,8 @@ module Exwiw
  adapter = Adapter.build(@connection_config, @logger)
  configs = load_table_config(adapter.class.table_config_class)

+ configs = reject_and_validate_skipped(configs)
+
  table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }

  target = table_by_name[@dump_target.table_name]
@@ -80,7 +88,7 @@ module Exwiw
  end
  end

- if adapter.supports_bulk_delete?
+ if adapter.supports_bulk_delete? && !@insert_only
  @logger.debug(" Generate DELETE statement...")
  delete_sql = adapter.to_bulk_delete(query_ast, table)
  if @logger.debug?
@@ -94,6 +102,18 @@ module Exwiw
  end
  end
  end
+
+ if @after_insert_hook_path
+ @logger.info("Running after-insert hook: #{@after_insert_hook_path}")
+ AfterInsertHook.run(
+ path: @after_insert_hook_path,
+ cli_options: @cli_options,
+ output_dir: @output_dir,
+ next_idx: total_size + 1,
+ output_extension: adapter.output_extension,
+ logger: @logger,
+ )
+ end
  end

  private def load_table_config(klass)
@@ -102,5 +122,31 @@ module Exwiw
  klass.from(json)
  end
  end
+
+ private def reject_and_validate_skipped(configs)
+ skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
+ return configs if skipped_names.empty?
+
+ configs.each do |config|
+ next if config.skip
+ next unless config.respond_to?(:belongs_tos)
+
+ dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
+ next if dangling.empty?
+
+ raise ArgumentError,
+ "Table '#{config.name}' has belongs_to references to skipped table(s): " \
+ "#{dangling.map(&:table_name).join(', ')}. " \
+ "Remove the belongs_to entries or unset `skip` on the referenced table."
+ end
+
+ if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
+ raise ArgumentError,
+ "--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
+ end
+
+ skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
+ configs.reject { |c| c.skip }
+ end
  end
  end
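Note on where `skip` comes from: `load_table_config` reads every `*.json` file in the config directory, so a table is excluded by adding `"skip": true` to its config file. The sketch below shows the same glob-and-parse shape on raw hashes. The helper `load_raw_configs` and the two config files are hypothetical, and the files carry only `name` and `skip`; real exwiw configs have more attributes (for example `columns` and `belongs_tos`) and are loaded through the adapter's typed config class rather than as plain hashes.

```ruby
require "json"
require "tmpdir"

# Hypothetical helper (not exwiw API): parses each JSON config into a plain
# hash instead of a typed config object. Sorted for a stable order.
def load_raw_configs(config_dir)
  Dir[File.join(config_dir, "*.json")].sort.map { |file| JSON.parse(File.read(file)) }
end

Dir.mktmpdir do |dir|
  # Illustrative config files; only the attributes needed here are written.
  File.write(File.join(dir, "users.json"), JSON.generate({ "name" => "users" }))
  File.write(File.join(dir, "audit_logs.json"), JSON.generate({ "name" => "audit_logs", "skip" => true }))

  # An absent `skip` key behaves like false, matching the optional attribute.
  skipped, kept = load_raw_configs(dir).partition { |config| config["skip"] }

  puts "skipped: #{skipped.map { |c| c["name"] }.join(', ')}"
  puts "kept: #{kept.map { |c| c["name"] }.join(', ')}"
end
```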
data/lib/exwiw/table_config.rb CHANGED
@@ -10,6 +10,7 @@ module Exwiw
  attribute :belongs_tos, array(BelongsTo)
  attribute :columns, array(TableColumn)
  attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
+ attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true

  def self.from_symbol_keys(hash)
  from(JSON.parse(hash.to_json))
@@ -76,6 +77,7 @@ module Exwiw
  merged_table.filter = filter
  merged_table.belongs_tos = passed_table.belongs_tos
  merged_table.bulk_insert_chunk_size = passed_table.bulk_insert_chunk_size
+ merged_table.skip = skip

  receiver_column_by_name = columns.each_with_object({}) { |column, hash| hash[column.name] = column }

data/lib/exwiw/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Exwiw
- VERSION = "0.1.9"
+ VERSION = "0.2.1"
  end
data/lib/exwiw.rb CHANGED
@@ -21,7 +21,9 @@ require_relative "exwiw/determine_table_processing_order"
  require_relative "exwiw/mongo_query"
  require_relative "exwiw/query_ast"
  require_relative "exwiw/query_ast_builder"
+ require_relative "exwiw/after_insert_hook"
  require_relative "exwiw/runner"
+ require_relative "exwiw/explain_runner"
  require_relative "exwiw/schema_generator"

  begin
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: exwiw
  version: !ruby/object:Gem::Version
- version: 0.1.9
+ version: 0.2.1
  platform: ruby
  authors:
  - Shia
@@ -37,6 +37,8 @@ files:
  - README.md
  - docs/plans/2026-05-15-insert-000-schema-file.md
  - docs/plans/2026-05-16-mongodb-from-clean-scenario.md
+ - docs/plans/2026-05-22-after-insert-hook.md
+ - docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md
  - exe/exwiw
  - lib/exwiw.rb
  - lib/exwiw/adapter.rb
@@ -44,11 +46,13 @@ files:
  - lib/exwiw/adapter/mysql2_adapter.rb
  - lib/exwiw/adapter/postgresql_adapter.rb
  - lib/exwiw/adapter/sqlite3_adapter.rb
+ - lib/exwiw/after_insert_hook.rb
  - lib/exwiw/belongs_to.rb
  - lib/exwiw/cli.rb
  - lib/exwiw/ddl_postprocessor.rb
  - lib/exwiw/determine_table_processing_order.rb
  - lib/exwiw/embedded_in.rb
+ - lib/exwiw/explain_runner.rb
  - lib/exwiw/mongo_query.rb
  - lib/exwiw/mongodb_collection_config.rb
  - lib/exwiw/mongodb_field.rb