exwiw 0.2.0 → 0.2.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +7 -0
- data/README.md +44 -42
- data/docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md +91 -0
- data/lib/exwiw/adapter/mongodb_adapter.rb +4 -0
- data/lib/exwiw/adapter/mysql2_adapter.rb +11 -0
- data/lib/exwiw/adapter/postgresql_adapter.rb +7 -0
- data/lib/exwiw/adapter/sqlite3_adapter.rb +8 -0
- data/lib/exwiw/adapter.rb +7 -0
- data/lib/exwiw/cli.rb +91 -32
- data/lib/exwiw/explain_runner.rb +83 -0
- data/lib/exwiw/mongodb_collection_config.rb +1 -0
- data/lib/exwiw/runner.rb +28 -0
- data/lib/exwiw/table_config.rb +2 -0
- data/lib/exwiw/version.rb +1 -1
- data/lib/exwiw.rb +1 -0
- metadata +3 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 4855b3fc49afc6cc69606579eb15c0cf430c4123024ec8e9b26a5215989292d3
|
|
4
|
+
data.tar.gz: 1ee99545e00cb43292c59b918dc24d3b4f764427f8708fca773c0b507f6c1e2d
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: fcbe518ae634e294bd48e61088dd4bd0f1012b317b08d921bcbd7ee95e4e595c69c9df59ad18593dbcee35b017c1fa6a198102f27757183b804218920ff64cbb
|
|
7
|
+
data.tar.gz: 48475293bc58a6f32ec9edc68c3bdc0bdd1427233ec4cd2366fb656e5fdd1e050a260056fc69e73c5735eeb8cb6b457ef7244a13589946430f138359cc050911
|
data/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,13 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.2.1] - 2026-05-23
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
|
|
9
|
+
- `skip: true` table config attribute to explicitly exclude a table from the dump. Skipped tables produce no schema entry, no `insert-*` file, and no `delete-*` file. Using a skipped table as `--target-table`, or having another non-skipped table reference it via `belongs_to`, raises `ArgumentError` on load. Available for both SQL adapters (`TableConfig`) and the MongoDB adapter (`MongodbCollectionConfig`). ([#26](https://github.com/heyinc/exwiw/pull/26))
|
|
10
|
+
- `dump` / `explain` subcommands. `dump` is the default and preserves the existing behavior when no subcommand is given. `explain` prints the compiled SQL and its `EXPLAIN` output (estimate-only — `EXPLAIN QUERY PLAN` on SQLite) for each extraction query to stdout without executing the SELECTs. Supported for `mysql2`, `postgresql`, and `sqlite3`; `mongodb` is not yet supported. ([#28](https://github.com/heyinc/exwiw/pull/28))
|
|
11
|
+
|
|
5
12
|
## [0.2.0] - 2026-05-22
|
|
6
13
|
|
|
7
14
|
### Added
|
data/README.md
CHANGED
|
@@ -44,7 +44,12 @@ gem install exwiw
|
|
|
44
44
|
|
|
45
45
|
## Usage
|
|
46
46
|
|
|
47
|
-
|
|
47
|
+
exwiw has two subcommands:
|
|
48
|
+
|
|
49
|
+
- `dump` (default) — generate INSERT/COPY SQL files. This is the existing behavior; if the subcommand is omitted, `dump` is assumed for backwards compatibility.
|
|
50
|
+
- `explain` — print the compiled SQL and its `EXPLAIN` output for each query that `dump` would run, without executing the SELECTs.
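The backwards-compatible default described above can be pictured with a minimal Ruby sketch. It follows the idea visible in the CLI change (a leading `dump` or `explain` token is consumed, anything else falls back to `dump`), but `split_subcommand` is a hypothetical helper name used only for illustration and is not a method of the gem.

```ruby
# Hypothetical helper; the names below are illustrative, not exwiw's API.
KNOWN_SUBCOMMANDS = %w[dump explain].freeze

# Returns [subcommand, remaining_args] without mutating the caller's argv.
def split_subcommand(argv)
  args = argv.dup
  if KNOWN_SUBCOMMANDS.include?(args.first)
    # A recognized leading token is consumed as the subcommand.
    [args.shift, args]
  else
    # No recognized leading token: assume `dump` for backwards compatibility.
    ["dump", args]
  end
end

p split_subcommand(%w[explain --adapter=sqlite3])
p split_subcommand(%w[--adapter=sqlite3])
```

Because option flags start with `-` and are never in the known list, an invocation written for an older version is still routed to `dump` with its arguments untouched.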
|
|
51
|
+
|
|
52
|
+
### `exwiw dump`
|
|
48
53
|
|
|
49
54
|
```bash
|
|
50
55
|
# dump & masking all records from database to dump.sql based on schema.json
|
|
@@ -92,6 +97,22 @@ you need to delete the records before importing the dump,
|
|
|
92
97
|
This SQL deletes "all" records related to the extraction targets.
|
|
93
98
|
The idx has the same meaning as in the insert SQL.
|
|
94
99
|
|
|
100
|
+
### `exwiw explain`
|
|
101
|
+
|
|
102
|
+
Print the compiled SQL and its `EXPLAIN` output (estimate-only; `EXPLAIN QUERY PLAN` on SQLite) for each query that `dump` would run, to stdout. No SELECT is executed. Supported for `mysql2`, `postgresql`, and `sqlite3`. The `mongodb` adapter is not yet supported.
|
|
103
|
+
|
|
104
|
+
```bash
|
|
105
|
+
# preview the queries exwiw would run, without executing the SELECTs
|
|
106
|
+
exwiw explain \
|
|
107
|
+
--adapter=postgresql \
|
|
108
|
+
--host=localhost --port=5432 --user=reader \
|
|
109
|
+
--database=app_production \
|
|
110
|
+
--config-dir=exwiw \
|
|
111
|
+
--target-table=shops --ids=1
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
The `--output-dir`, `--output-format`, `--insert-only`, and `--after-insert-hook` options are dump-specific and rejected when used with `explain`.
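A minimal sketch of how such a rejection check can work, assuming each dump-specific option stays `nil` until its flag is passed (which is how the CLI diff initializes them). `DUMP_ONLY_FLAGS` and `rejected_for_explain` are hypothetical names for illustration only.

```ruby
# Hypothetical sketch: option values are nil unless the flag was given.
DUMP_ONLY_FLAGS = {
  output_dir: "--output-dir",
  output_format: "--output-format",
  insert_only: "--insert-only",
  after_insert_hook: "--after-insert-hook",
}.freeze

# Returns the dump-only flags that were explicitly set, in a stable order.
def rejected_for_explain(options)
  DUMP_ONLY_FLAGS.reject { |key, _flag| options[key].nil? }.values
end

rejected = rejected_for_explain({ output_dir: "/tmp/dump", insert_only: true })
warn "Not applicable to 'explain': #{rejected.join(', ')}" unless rejected.empty?
```

Using `nil` as the "not given" marker is what lets the check tell an explicit `--output-format=insert` apart from the default, which would be impossible if the defaults were assigned up front.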
|
|
115
|
+
|
|
95
116
|
### Generator
|
|
96
117
|
|
|
97
118
|
The config generator is provided as a Rake task.
|
|
@@ -136,21 +157,7 @@ This is an example of the one table schema:
|
|
|
136
157
|
|
|
137
158
|
### Output format
|
|
138
159
|
|
|
139
|
-
By default, exwiw generates `INSERT` statements. For PostgreSQL, you can
|
|
140
|
-
|
|
141
|
-
```bash
|
|
142
|
-
exwiw \
|
|
143
|
-
--adapter=postgresql \
|
|
144
|
-
--host=localhost \
|
|
145
|
-
--port=5432 \
|
|
146
|
-
--user=reader \
|
|
147
|
-
--database=app_production \
|
|
148
|
-
--config-dir=exwiw \
|
|
149
|
-
--target-table=shops \
|
|
150
|
-
--ids=1 \
|
|
151
|
-
--output-dir=dump \
|
|
152
|
-
--output-format=copy
|
|
153
|
-
```
|
|
160
|
+
By default, exwiw generates `INSERT` statements. For PostgreSQL, you can pass `--output-format=copy` to generate `COPY FROM stdin` format instead, which is significantly faster for bulk loading.
|
|
154
161
|
|
|
155
162
|
The generated file uses tab-separated values with PostgreSQL's text-format escaping (`\N` for NULL, `\\` for backslash, etc.). Import with `psql`:
|
|
156
163
|
|
|
@@ -162,21 +169,7 @@ psql -d app_dev -f dump/insert-001-shops.sql
|
|
|
162
169
|
|
|
163
170
|
### Skip DELETE SQL output
|
|
164
171
|
|
|
165
|
-
By default, exwiw generates `delete-*.sql` files alongside the `insert-*.sql` files so that an existing dataset can be cleared before re-inserting. Pass `--insert-only` when you only need the insert files
|
|
166
|
-
|
|
167
|
-
```bash
|
|
168
|
-
exwiw \
|
|
169
|
-
--adapter=mysql2 \
|
|
170
|
-
--host=localhost \
|
|
171
|
-
--port=3306 \
|
|
172
|
-
--user=reader \
|
|
173
|
-
--database=app_production \
|
|
174
|
-
--config-dir=exwiw \
|
|
175
|
-
--target-table=shops \
|
|
176
|
-
--ids=1 \
|
|
177
|
-
--output-dir=dump \
|
|
178
|
-
--insert-only
|
|
179
|
-
```
|
|
172
|
+
By default, exwiw generates `delete-*.sql` files alongside the `insert-*.sql` files so that an existing dataset can be cleared before re-inserting. Pass `--insert-only` when you only need the insert files.
|
|
180
173
|
|
|
181
174
|
### After-insert hook
|
|
182
175
|
|
|
@@ -198,17 +191,6 @@ insert_sql <<~SQL
|
|
|
198
191
|
SQL
|
|
199
192
|
```
|
|
200
193
|
|
|
201
|
-
Run with:
|
|
202
|
-
|
|
203
|
-
```bash
|
|
204
|
-
exwiw \
|
|
205
|
-
--adapter=mysql2 --host=localhost --port=3306 --user=reader \
|
|
206
|
-
--database=app_production --config-dir=exwiw \
|
|
207
|
-
--target-table=shops --ids=1,2 \
|
|
208
|
-
--output-dir=dump \
|
|
209
|
-
--after-insert-hook=hooks/seed_default_users.rb
|
|
210
|
-
```
|
|
211
|
-
|
|
212
194
|
**Shell hook**: anything other than `.rb` is exec'd as a child process. It is a pure side-effect hook — exwiw does not capture its stdout. The hook receives these env vars and inherits `DATABASE_PASSWORD` from the parent:
|
|
213
195
|
|
|
214
196
|
- `EXWIW_OUTPUT_DIR`, `EXWIW_CONFIG_DIR`
|
|
@@ -219,6 +201,26 @@ A non-zero exit code from the shell hook aborts exwiw.
|
|
|
219
201
|
|
|
220
202
|
Note: Ruby hooks are evaluated via `instance_eval` inside the exwiw process — only pass paths you trust.
|
|
221
203
|
|
|
204
|
+
### Skip a table
|
|
205
|
+
|
|
206
|
+
Set `"skip": true` on a table's config JSON to explicitly exclude it from the dump. The table is omitted from `insert-000-schema.{sql,js}`, and no `insert-*` / `delete-*` files are generated for it. Skipped tables are also not queried at all.
|
|
207
|
+
|
|
208
|
+
```json
|
|
209
|
+
{
|
|
210
|
+
"name": "audit_logs",
|
|
211
|
+
"primary_key": "id",
|
|
212
|
+
"skip": true,
|
|
213
|
+
"belongs_tos": [],
|
|
214
|
+
"columns": [{ "name": "id" }]
|
|
215
|
+
}
|
|
216
|
+
```
|
|
217
|
+
|
|
218
|
+
Constraints:
|
|
219
|
+
|
|
220
|
+
- If another non-skipped table has a `belongs_to` entry pointing at a skipped table, exwiw raises `ArgumentError` on load. Remove the `belongs_to` entry on the referencing table, or unset `skip` on the referenced table.
|
|
221
|
+
- Specifying a skipped table as `--target-table` raises `ArgumentError`.
|
|
222
|
+
- `skip: true` is preserved by `exwiw:schema:generate` regenerations (the receiver value wins over the auto-generated config).
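The first two constraints can be sketched in a few lines of Ruby. This is a simplified illustration, not the gem's code: `TableSketch` is a hypothetical stand-in for a table config, and `belongs_tos` is reduced to a plain list of referenced table names.

```ruby
require "set"

# Hypothetical stand-in for a table config (belongs_tos holds table names).
TableSketch = Struct.new(:name, :skip, :belongs_tos, keyword_init: true)

# Validates skip constraints, then returns only the non-skipped tables.
def reject_skipped(tables, target_table: nil)
  skipped = tables.select(&:skip).map(&:name).to_set

  tables.each do |table|
    next if table.skip # references held by skipped tables are irrelevant

    dangling = table.belongs_tos.select { |name| skipped.include?(name) }
    next if dangling.empty?

    raise ArgumentError,
          "'#{table.name}' references skipped table(s): #{dangling.join(', ')}"
  end

  if target_table && skipped.include?(target_table)
    raise ArgumentError, "'#{target_table}' is marked skip and cannot be the target"
  end

  tables.reject(&:skip)
end

tables = [
  TableSketch.new(name: "shops", skip: nil, belongs_tos: []),
  TableSketch.new(name: "audit_logs", skip: true, belongs_tos: ["shops"]),
]
p reject_skipped(tables).map(&:name)
```

Note that a skipped table may still point at a non-skipped one (as `audit_logs` does here); only references from a kept table into a skipped table are treated as an error.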
|
|
223
|
+
|
|
222
224
|
### Bulk insert chunk size
|
|
223
225
|
|
|
224
226
|
`bulk_insert_chunk_size` splits the generated `INSERT` statement into multiple statements, each containing at most the specified number of rows. This is useful when the number of records per table is large enough to hit limits like MySQL's `max_allowed_packet`.
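As a rough illustration of the chunking idea, the sketch below builds one `INSERT` per slice of rows. It is not the gem's implementation: `chunked_inserts` is a hypothetical helper, and the row values are assumed to be already formatted as SQL literals.

```ruby
# Hypothetical sketch: one INSERT statement per chunk of at most chunk_size rows.
# Row values are assumed to be pre-quoted SQL literals.
def chunked_inserts(table, columns, rows, chunk_size)
  rows.each_slice(chunk_size).map do |chunk|
    values = chunk.map { |row| "(#{row.join(', ')})" }.join(", ")
    "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{values};"
  end
end

puts chunked_inserts("shops", %w[id name], [[1, "'a'"], [2, "'b'"], [3, "'c'"]], 2)
```

A smaller chunk size produces more, shorter statements, which keeps each one under packet-size limits at the cost of more round trips when the file is imported.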
|
data/docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md
ADDED
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
# Add a scenario_test that verifies the validity of PostgreSQL COPY-mode SQL
|
|
2
|
+
|
|
3
|
+
## Context
|
|
4
|
+
|
|
5
|
+
The PostgreSQL adapter in `exwiw` can switch to `COPY ... FROM stdin;` output with `--output-format=copy` (`lib/exwiw/adapter/postgresql_adapter.rb:77-85`). In the existing tests:
|
|
6
|
+
|
|
7
|
+
- Unit tests (`spec/adapter/postgresql_adapter_spec.rb:256-305`) verify the string format
|
|
8
|
+
- Runner integration tests (`spec/runner_spec.rb:236-286`) verify the file structure
|
|
9
|
+
|
|
10
|
+
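However, **there is no end-to-end test that verifies whether the generated COPY-mode SQL can actually be loaded with `psql -f`**. The user suspects that COPY mode may be emitting invalid SQL and wants to verify this against a real database.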
|
|
11
|
+
|
|
12
|
+
For the existing INSERT mode, `scenario/test_with_postgresql.sh` verifies everything up to and including re-import via `psql -f`. There is no corresponding COPY-mode version.
|
|
13
|
+
|
|
14
|
+
Goal: add an E2E scenario that actually feeds COPY-mode output to psql, plus a snapshot regression test, to surface any latent invalid SQL.
|
|
15
|
+
|
|
16
|
+
## Files to change
|
|
17
|
+
|
|
18
|
+
1. **New** `scenario/test_with_postgresql_copy.sh`: E2E shell script
|
|
19
|
+
2. **Modified** `spec/insert_output_snapshot_spec.rb`: a SCENARIOS entry for COPY and `snapshot_subdir` support
|
|
20
|
+
3. **Modified** `.github/workflows/scenario.yml`: a new step in the `with_postgres` job
|
|
21
|
+
4. **New** `spec/insert_output_snapshots/postgresql-copy/insert-*.sql`: auto-generated with `UPDATE_SNAPSHOTS=1`
|
|
22
|
+
|
|
23
|
+
## Details
|
|
24
|
+
|
|
25
|
+
### 1. `scenario/test_with_postgresql_copy.sh`
|
|
26
|
+
|
|
27
|
+
Use `scenario/test_with_postgresql.sh` as the template and change only the following:
|
|
28
|
+
|
|
29
|
+
- `FROM_DATABASE_NAME="exwiw_scenario_prod_db_copy"`
|
|
30
|
+
- `TO_DATABASE_NAME="exwiw_scenario_dev_db_copy"` (a name that does not collide with the existing scenario even when run in parallel)
|
|
31
|
+
- Add `--output-format=copy` to the `exe/exwiw` invocation
|
|
32
|
+
- Change the output directory to `--output-dir=tmp/postgresql-copy`
|
|
33
|
+
- The `delete-*.sql` / `insert-*.sql` loops also read from `tmp/postgresql-copy/`
|
|
34
|
+
|
|
35
|
+
With `set -e`, the script fails immediately if psql exits on a syntax or data error in a COPY block. This is the detection point for the "invalid SQL" the user suspects.
|
|
36
|
+
|
|
37
|
+
The final check (whether `INSERT INTO shops ... ` succeeds with auto-increment) is reused as-is, so the `post_insert_sql` appended after `to_copy_from_stdin` (the sequence `setval`) is verified as well.
|
|
38
|
+
|
|
39
|
+
Grant execute permission with `chmod +x` (to match the sibling scripts).
|
|
40
|
+
|
|
41
|
+
### 2. `spec/insert_output_snapshot_spec.rb`
|
|
42
|
+
|
|
43
|
+
- Change `snapshot_dir` on line 78 to use `scenario[:snapshot_subdir] || scenario[:adapter]`, so that one adapter can have multiple scenarios
|
|
44
|
+
- Add a suffix to the context label on line 76 when `output_format` is present, so the scenarios are distinguishable in the rspec output
|
|
45
|
+
- Add the following to the `SCENARIOS` array:
|
|
46
|
+
|
|
47
|
+
```ruby
|
|
48
|
+
{
|
|
49
|
+
adapter: "postgresql",
|
|
50
|
+
config_dir: "scenario/postgresql-schema",
|
|
51
|
+
output_format: "copy",
|
|
52
|
+
snapshot_subdir: "postgresql-copy",
|
|
53
|
+
connection: { adapter: "postgresql", database_name: "exwiw_test",
|
|
54
|
+
host: "127.0.0.1", port: 5432,
|
|
55
|
+
user: "postgres", password: "test_password" },
|
|
56
|
+
},
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
The `scenario.fetch(:output_format, "insert")` on line 86 already exists, so no further change is needed. The pg_dump normalization of `insert-000-schema.sql` (lines 21-25) also applies as-is because the adapter is the same.
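The two planned spec changes can be sketched as small helpers. This is only an illustration under assumptions: `snapshot_dir_name` and `context_label` are hypothetical names, and the exact label suffix format is not specified by the plan.

```ruby
# Hypothetical helpers mirroring the two planned spec changes.

# Falls back to the adapter name when no explicit snapshot_subdir is given,
# so one adapter can own several snapshot directories.
def snapshot_dir_name(scenario)
  scenario[:snapshot_subdir] || scenario[:adapter]
end

# Appends the output format (when present) so rspec output can tell
# same-adapter scenarios apart. The "(format)" style is an assumption.
def context_label(scenario)
  suffix = scenario[:output_format] ? " (#{scenario[:output_format]})" : ""
  "#{scenario[:adapter]}#{suffix}"
end

copy_scenario = { adapter: "postgresql", output_format: "copy", snapshot_subdir: "postgresql-copy" }
puts snapshot_dir_name(copy_scenario)
puts context_label(copy_scenario)
```

Existing scenarios that define neither key keep their current directory and label, so the change is additive.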
|
|
60
|
+
|
|
61
|
+
### 3. `.github/workflows/scenario.yml`
|
|
62
|
+
|
|
63
|
+
Add after the "Run exwiw (from clean target DB)" step (line 115) of the `with_postgres` job:
|
|
64
|
+
|
|
65
|
+
```yaml
|
|
66
|
+
- name: Run exwiw (copy mode)
|
|
67
|
+
run: scenario/test_with_postgresql_copy.sh
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
The `postgres:17-alpine` service and the `postgresql-client-17` installation are already handled by existing steps, so nothing needs to be added.
|
|
71
|
+
|
|
72
|
+
### 4. Snapshot generation
|
|
73
|
+
|
|
74
|
+
```
|
|
75
|
+
UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
Files equivalent to `insert-000-schema.sql` + `insert-001-shops.sql` ... `insert-007-transactions.sql` are generated under `spec/insert_output_snapshots/postgresql-copy/`. Commit them to git.
|
|
79
|
+
|
|
80
|
+
## Verification steps
|
|
81
|
+
|
|
82
|
+
1. Start PostgreSQL locally with `docker compose up -d postgres`
|
|
83
|
+
2. Run `bash scenario/test_with_postgresql_copy.sh`. Exit 0 means the COPY-mode SQL is valid when loaded through psql; a non-zero exit surfaces invalid SQL (identify the cause at that point and fix it separately)
|
|
84
|
+
3. Generate the snapshots with `UPDATE_SNAPSHOTS=1 bundle exec rspec spec/insert_output_snapshot_spec.rb`
|
|
85
|
+
4. Re-run `bundle exec rspec spec/insert_output_snapshot_spec.rb` without `UPDATE_SNAPSHOTS` and confirm that all scenarios (sqlite3 / mysql2 / postgresql / postgresql-copy / mongodb) pass
|
|
86
|
+
5. Confirm on CI that the new `Run exwiw (copy mode)` step of the `with_postgres` job passes (or detects invalid SQL)
|
|
87
|
+
|
|
88
|
+
## Expected outcomes
|
|
89
|
+
|
|
90
|
+
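- **If the tests pass**: the user's suspicion is unfounded (at least within the range of the seed data). The scenario stays as a regression test and gives early detection if a future change around COPY mode breaks the SQL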
|
|
91
|
+
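- **If the tests fail**: identify the cause from the failure mode (the psql error message). The fix is out of scope for this plan and is handled as a separate task (report to the user → decide on an approach → implement)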
|
data/lib/exwiw/adapter/mongodb_adapter.rb
CHANGED
|
@@ -97,6 +97,10 @@ module Exwiw
|
|
|
97
97
|
raise NotImplementedError, "MongodbAdapter does not support bulk delete"
|
|
98
98
|
end
|
|
99
99
|
|
|
100
|
+
def explain(_query)
|
|
101
|
+
raise NotImplementedError, "MongodbAdapter does not support explain yet"
|
|
102
|
+
end
|
|
103
|
+
|
|
100
104
|
def output_extension
|
|
101
105
|
'jsonl'
|
|
102
106
|
end
|
data/lib/exwiw/adapter/mysql2_adapter.rb
CHANGED
|
@@ -14,6 +14,17 @@ module Exwiw
|
|
|
14
14
|
connection.query(sql, cast: false, as: :array).to_a
|
|
15
15
|
end
|
|
16
16
|
|
|
17
|
+
def explain(query_ast)
|
|
18
|
+
sql = compile_ast(query_ast)
|
|
19
|
+
|
|
20
|
+
@logger.debug(" Executing EXPLAIN: \n#{sql}")
|
|
21
|
+
rows = connection.query("EXPLAIN #{sql}", cast: false).to_a
|
|
22
|
+
rows.each_with_index.flat_map do |row, i|
|
|
23
|
+
["*************************** #{i + 1}. row ***************************"] +
|
|
24
|
+
row.map { |k, v| "#{k}: #{v}" }
|
|
25
|
+
end.join("\n")
|
|
26
|
+
end
|
|
27
|
+
|
|
17
28
|
def dump_schema(ordered_tables, output_path)
|
|
18
29
|
require 'open3'
|
|
19
30
|
|
data/lib/exwiw/adapter/postgresql_adapter.rb
CHANGED
|
@@ -14,6 +14,13 @@ module Exwiw
|
|
|
14
14
|
connection.exec(sql).values
|
|
15
15
|
end
|
|
16
16
|
|
|
17
|
+
def explain(query_ast)
|
|
18
|
+
sql = compile_ast(query_ast)
|
|
19
|
+
|
|
20
|
+
@logger.debug(" Executing EXPLAIN: \n#{sql}")
|
|
21
|
+
connection.exec("EXPLAIN #{sql}").values.map(&:first).join("\n")
|
|
22
|
+
end
|
|
23
|
+
|
|
17
24
|
def dump_schema(ordered_tables, output_path)
|
|
18
25
|
require 'open3'
|
|
19
26
|
|
data/lib/exwiw/adapter/sqlite3_adapter.rb
CHANGED
|
@@ -14,6 +14,14 @@ module Exwiw
|
|
|
14
14
|
connection.execute(sql)
|
|
15
15
|
end
|
|
16
16
|
|
|
17
|
+
def explain(query_ast)
|
|
18
|
+
sql = compile_ast(query_ast)
|
|
19
|
+
|
|
20
|
+
@logger.debug(" Executing EXPLAIN QUERY PLAN: \n#{sql}")
|
|
21
|
+
rows = connection.execute("EXPLAIN QUERY PLAN #{sql}")
|
|
22
|
+
rows.map { |row| row[3] }.join("\n")
|
|
23
|
+
end
|
|
24
|
+
|
|
17
25
|
def dump_schema(ordered_tables, output_path)
|
|
18
26
|
@logger.debug(" Reading schema from sqlite_master...")
|
|
19
27
|
target_names = ordered_tables.map(&:name)
|
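The three SQL adapters above each turn driver rows into a single printable string in a different way. The sketch below reproduces just that formatting step on made-up sample rows, with no database involved. The function names are hypothetical, and the sample plan text is illustrative rather than real `EXPLAIN` output.

```ruby
# MySQL-style: each row is a Hash of column name => value, printed vertically.
def format_vertical(rows)
  rows.each_with_index.flat_map do |row, i|
    ["*************************** #{i + 1}. row ***************************"] +
      row.map { |k, v| "#{k}: #{v}" }
  end.join("\n")
end

# PostgreSQL-style: each row is a one-element Array holding a plan line.
def format_first_column(rows)
  rows.map(&:first).join("\n")
end

# SQLite-style: EXPLAIN QUERY PLAN rows are [id, parent, notused, detail],
# so the human-readable text is the fourth column.
def format_detail_column(rows)
  rows.map { |row| row[3] }.join("\n")
end

# Made-up sample rows, for illustration only.
puts format_vertical([{ "id" => "1", "table" => "shops" }])
puts format_first_column([["Seq Scan on shops"], ["  Filter: (id = 1)"]])
puts format_detail_column([[2, 0, 0, "SCAN shops"]])
```

Returning a plain string from every adapter keeps the caller independent of each driver's row shape.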
data/lib/exwiw/adapter.rb
CHANGED
|
@@ -74,6 +74,13 @@ module Exwiw
|
|
|
74
74
|
def to_copy_from_stdin(_results, _table)
|
|
75
75
|
raise NotImplementedError, "COPY format is not supported by #{self.class.name}"
|
|
76
76
|
end
|
|
77
|
+
|
|
78
|
+
# Run the database-specific EXPLAIN for the given query and return the
|
|
79
|
+
# output as a single string for `explain` subcommand to print.
|
|
80
|
+
# SQL adapters override; MongodbAdapter currently raises.
|
|
81
|
+
def explain(_query_ast)
|
|
82
|
+
raise NotImplementedError, "#{self.class.name} does not implement #explain"
|
|
83
|
+
end
|
|
77
84
|
end
|
|
78
85
|
|
|
79
86
|
# @params [Exwiw::QueryAst] query_ast
|
data/lib/exwiw/cli.rb
CHANGED
|
@@ -10,26 +10,36 @@ require 'exwiw'
|
|
|
10
10
|
|
|
11
11
|
module Exwiw
|
|
12
12
|
class CLI
|
|
13
|
+
KNOWN_SUBCOMMANDS = %w[dump explain].freeze
|
|
14
|
+
|
|
13
15
|
def self.start(argv)
|
|
14
16
|
new(argv).run
|
|
15
17
|
end
|
|
16
18
|
|
|
17
19
|
def initialize(argv)
|
|
18
20
|
@argv = argv.dup
|
|
19
|
-
|
|
21
|
+
|
|
22
|
+
@subcommand =
|
|
23
|
+
if !@argv.empty? && !@argv.first.start_with?("-") && KNOWN_SUBCOMMANDS.include?(@argv.first)
|
|
24
|
+
@argv.shift
|
|
25
|
+
else
|
|
26
|
+
"dump"
|
|
27
|
+
end
|
|
28
|
+
|
|
29
|
+
@help = @argv.empty?
|
|
20
30
|
|
|
21
31
|
@database_host = nil
|
|
22
32
|
@database_port = nil
|
|
23
33
|
@database_user = nil
|
|
24
34
|
@database_password = ENV["DATABASE_PASSWORD"]
|
|
25
|
-
@output_dir =
|
|
35
|
+
@output_dir = nil
|
|
26
36
|
@config_dir = nil
|
|
27
37
|
@database_adapter = nil
|
|
28
38
|
@database_name = nil
|
|
29
39
|
@target_table_name = nil
|
|
30
40
|
@ids = []
|
|
31
|
-
@output_format =
|
|
32
|
-
@insert_only =
|
|
41
|
+
@output_format = nil
|
|
42
|
+
@insert_only = nil
|
|
33
43
|
@after_insert_hook_path = nil
|
|
34
44
|
@log_level = :info
|
|
35
45
|
|
|
@@ -39,25 +49,29 @@ module Exwiw
|
|
|
39
49
|
def run
|
|
40
50
|
if @help
|
|
41
51
|
puts parser.help
|
|
42
|
-
|
|
43
|
-
|
|
52
|
+
return
|
|
53
|
+
end
|
|
44
54
|
|
|
45
|
-
|
|
46
|
-
adapter: @database_adapter,
|
|
47
|
-
host: @database_host,
|
|
48
|
-
port: @database_port,
|
|
49
|
-
user: @database_user,
|
|
50
|
-
password: @database_password,
|
|
51
|
-
database_name: @database_name,
|
|
52
|
-
)
|
|
55
|
+
validate_options!
|
|
53
56
|
|
|
54
|
-
|
|
55
|
-
|
|
56
|
-
|
|
57
|
-
|
|
57
|
+
connection_config = ConnectionConfig.new(
|
|
58
|
+
adapter: @database_adapter,
|
|
59
|
+
host: @database_host,
|
|
60
|
+
port: @database_port,
|
|
61
|
+
user: @database_user,
|
|
62
|
+
password: @database_password,
|
|
63
|
+
database_name: @database_name,
|
|
64
|
+
)
|
|
65
|
+
|
|
66
|
+
dump_target = DumpTarget.new(
|
|
67
|
+
table_name: @target_table_name,
|
|
68
|
+
ids: @ids,
|
|
69
|
+
)
|
|
58
70
|
|
|
59
|
-
|
|
71
|
+
logger = build_logger
|
|
60
72
|
|
|
73
|
+
case @subcommand
|
|
74
|
+
when "dump"
|
|
61
75
|
Runner.new(
|
|
62
76
|
connection_config: connection_config,
|
|
63
77
|
output_dir: @output_dir,
|
|
@@ -69,10 +83,22 @@ module Exwiw
|
|
|
69
83
|
cli_options: build_cli_options_hash,
|
|
70
84
|
logger: logger,
|
|
71
85
|
).run
|
|
86
|
+
when "explain"
|
|
87
|
+
ExplainRunner.new(
|
|
88
|
+
connection_config: connection_config,
|
|
89
|
+
config_dir: @config_dir,
|
|
90
|
+
dump_target: dump_target,
|
|
91
|
+
logger: logger,
|
|
92
|
+
io: $stdout,
|
|
93
|
+
).run
|
|
72
94
|
end
|
|
73
95
|
end
|
|
74
96
|
|
|
75
97
|
private def validate_options!
|
|
98
|
+
if @subcommand == "explain"
|
|
99
|
+
validate_explain_only!
|
|
100
|
+
end
|
|
101
|
+
|
|
76
102
|
if @database_adapter != "sqlite3"
|
|
77
103
|
required_options = {
|
|
78
104
|
"Target database host" => @database_host,
|
|
@@ -99,15 +125,21 @@ module Exwiw
|
|
|
99
125
|
exit 1
|
|
100
126
|
end
|
|
101
127
|
|
|
102
|
-
|
|
103
|
-
|
|
104
|
-
|
|
105
|
-
|
|
106
|
-
end
|
|
128
|
+
if @subcommand == "dump"
|
|
129
|
+
@output_dir ||= "dump"
|
|
130
|
+
@output_format ||= "insert"
|
|
131
|
+
@insert_only = @insert_only ? true : false
|
|
107
132
|
|
|
108
|
-
|
|
109
|
-
|
|
110
|
-
|
|
133
|
+
valid_output_formats = ["insert", "copy"]
|
|
134
|
+
unless valid_output_formats.include?(@output_format)
|
|
135
|
+
$stderr.puts "Invalid output format '#{@output_format}'. Available options are: #{valid_output_formats.join(', ')}"
|
|
136
|
+
exit 1
|
|
137
|
+
end
|
|
138
|
+
|
|
139
|
+
if @output_format == "copy" && @database_adapter != "postgresql"
|
|
140
|
+
$stderr.puts "--output-format=copy is only supported with the postgresql adapter"
|
|
141
|
+
exit 1
|
|
142
|
+
end
|
|
111
143
|
end
|
|
112
144
|
|
|
113
145
|
if @config_dir.nil?
|
|
@@ -149,6 +181,24 @@ module Exwiw
|
|
|
149
181
|
end
|
|
150
182
|
end
|
|
151
183
|
|
|
184
|
+
private def validate_explain_only!
|
|
185
|
+
if @database_adapter == "mongodb"
|
|
186
|
+
$stderr.puts "mongodb adapter is not yet supported by 'explain' subcommand"
|
|
187
|
+
exit 1
|
|
188
|
+
end
|
|
189
|
+
|
|
190
|
+
rejected = []
|
|
191
|
+
rejected << "--output-dir" unless @output_dir.nil?
|
|
192
|
+
rejected << "--output-format" unless @output_format.nil?
|
|
193
|
+
rejected << "--insert-only" unless @insert_only.nil?
|
|
194
|
+
rejected << "--after-insert-hook" unless @after_insert_hook_path.nil?
|
|
195
|
+
|
|
196
|
+
unless rejected.empty?
|
|
197
|
+
$stderr.puts "The following options are not applicable in 'explain' subcommand: #{rejected.join(', ')}"
|
|
198
|
+
exit 1
|
|
199
|
+
end
|
|
200
|
+
end
|
|
201
|
+
|
|
152
202
|
private def build_cli_options_hash
|
|
153
203
|
{
|
|
154
204
|
database_host: @database_host,
|
|
@@ -185,13 +235,22 @@ module Exwiw
|
|
|
185
235
|
|
|
186
236
|
private def parser
|
|
187
237
|
@parser ||= OptionParser.new do |opts|
|
|
188
|
-
opts.banner =
|
|
238
|
+
opts.banner = <<~BANNER
|
|
239
|
+
exwiw #{Exwiw::VERSION}
|
|
240
|
+
|
|
241
|
+
Usage: exwiw [SUBCOMMAND] [options]
|
|
242
|
+
|
|
243
|
+
Subcommands:
|
|
244
|
+
dump Generate INSERT/COPY SQL files (default when omitted).
|
|
245
|
+
explain Print EXPLAIN output for each extraction query to stdout.
|
|
246
|
+
(not yet supported for the mongodb adapter)
|
|
247
|
+
BANNER
|
|
189
248
|
opts.version = Exwiw::VERSION
|
|
190
249
|
|
|
191
250
|
opts.on("-h", "--host=HOST", "Target database host") { |v| @database_host = v }
|
|
192
251
|
opts.on("-p", "--port=PORT", "Target database port") { |v| @database_port = v }
|
|
193
252
|
opts.on("-u", "--user=USERNAME", "Target database user") { |v| @database_user = v }
|
|
194
|
-
opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/") do |v|
|
|
253
|
+
opts.on("-o", "--output-dir=[DUMP_DIR_PATH]", "Output file path. default is dump/ (dump subcommand only)") do |v|
|
|
195
254
|
v = v.end_with?("/") ? v[0..-2] : v
|
|
196
255
|
@output_dir = File.expand_path(v)
|
|
197
256
|
end
|
|
@@ -203,9 +262,9 @@ module Exwiw
|
|
|
203
262
|
opts.on("--database=DATABASE", "Target database name") { |v| @database_name = v }
|
|
204
263
|
opts.on("--target-table=[TABLE]", "Target table for extraction. If omitted, dump all tables.") { |v| @target_table_name = v }
|
|
205
264
|
opts.on("--ids=[IDS]", "Comma-separated list of identifiers. Required when --target-table is given.") { |v| @ids = v.split(',') }
|
|
206
|
-
opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only)") { |v| @output_format = v }
|
|
207
|
-
opts.on("--insert-only", "Do not generate DELETE SQL files") { @insert_only = true }
|
|
208
|
-
opts.on("--after-insert-hook=PATH", "Path to a .rb or .sh post-processing hook executed after all insert/delete files are written") do |v|
|
|
265
|
+
opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only, dump subcommand only)") { |v| @output_format = v }
|
|
266
|
+
opts.on("--insert-only", "Do not generate DELETE SQL files (dump subcommand only)") { @insert_only = true }
|
|
267
|
+
opts.on("--after-insert-hook=PATH", "Path to a .rb or .sh post-processing hook executed after all insert/delete files are written (dump subcommand only)") do |v|
|
|
209
268
|
@after_insert_hook_path = File.expand_path(v)
|
|
210
269
|
end
|
|
211
270
|
opts.on("--log-level=LEVEL", "Log level (debug, info). default is info") { |v| @log_level = v.to_sym }
|
data/lib/exwiw/explain_runner.rb
ADDED
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Exwiw
|
|
4
|
+
class ExplainRunner
|
|
5
|
+
def initialize(
|
|
6
|
+
connection_config:,
|
|
7
|
+
config_dir:,
|
|
8
|
+
dump_target:,
|
|
9
|
+
logger:,
|
|
10
|
+
io: $stdout
|
|
11
|
+
)
|
|
12
|
+
@connection_config = connection_config
|
|
13
|
+
@config_dir = config_dir
|
|
14
|
+
@dump_target = dump_target
|
|
15
|
+
@logger = logger
|
|
16
|
+
@io = io
|
|
17
|
+
end
|
|
18
|
+
|
|
19
|
+
def run
|
|
20
|
+
adapter = Adapter.build(@connection_config, @logger)
|
|
21
|
+
configs = load_table_config(adapter.class.table_config_class)
|
|
22
|
+
configs = reject_and_validate_skipped(configs)
|
|
23
|
+
|
|
24
|
+
table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
|
|
25
|
+
|
|
26
|
+
target = table_by_name[@dump_target.table_name]
|
|
27
|
+
adapter.validate_as_dump_target!(target) if target
|
|
28
|
+
|
|
29
|
+
@logger.debug("Determining table processing order...")
|
|
30
|
+
ordered_table_names = DetermineTableProcessingOrder.run(configs.select { |c| adapter.dumpable?(c) })
|
|
31
|
+
|
|
32
|
+
total_size = ordered_table_names.size
|
|
33
|
+
ordered_table_names.each_with_index do |table_name, idx|
|
|
34
|
+
@logger.debug("Explaining '#{table_name}'... (#{idx + 1}/#{total_size})")
|
|
35
|
+
table = table_by_name.fetch(table_name)
|
|
36
|
+
|
|
37
|
+
query_ast = adapter.build_query(table, @dump_target, table_by_name)
|
|
38
|
+
sql = adapter.compile_ast(query_ast)
|
|
39
|
+
explain_text = adapter.explain(query_ast)
|
|
40
|
+
|
|
41
|
+
@io.puts "-- [#{idx + 1}/#{total_size}] #{table_name}"
|
|
42
|
+
@io.puts sql
|
|
43
|
+
@io.puts
|
|
44
|
+
@io.puts "-- EXPLAIN:"
|
|
45
|
+
@io.puts explain_text
|
|
46
|
+
@io.puts
|
|
47
|
+
end
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
private def load_table_config(klass)
|
|
51
|
+
Dir[File.join(@config_dir, "*.json")].map do |file|
|
|
52
|
+
json = JSON.parse(File.read(file))
|
|
53
|
+
klass.from(json)
|
|
54
|
+
end
|
|
55
|
+
end
|
|
56
|
+
|
|
57
|
+
private def reject_and_validate_skipped(configs)
|
|
58
|
+
skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
|
|
59
|
+
return configs if skipped_names.empty?
|
|
60
|
+
|
|
61
|
+
configs.each do |config|
|
|
62
|
+
next if config.skip
|
|
63
|
+
next unless config.respond_to?(:belongs_tos)
|
|
64
|
+
|
|
65
|
+
dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
|
|
66
|
+
next if dangling.empty?
|
|
67
|
+
|
|
68
|
+
raise ArgumentError,
|
|
69
|
+
"Table '#{config.name}' has belongs_to references to skipped table(s): " \
|
|
70
|
+
"#{dangling.map(&:table_name).join(', ')}. " \
|
|
71
|
+
"Remove the belongs_to entries or unset `skip` on the referenced table."
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
|
|
75
|
+
raise ArgumentError,
|
|
76
|
+
"--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
|
|
77
|
+
end
|
|
78
|
+
|
|
79
|
+
skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
|
|
80
|
+
configs.reject { |c| c.skip }
|
|
81
|
+
end
|
|
82
|
+
end
|
|
83
|
+
end
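The report layout written by the runner above, and the reason for the injectable `io:` argument, can be seen in a self-contained sketch. It assumes the queries and plans are already computed: `print_explain_report` and the `entries` triples are hypothetical, and the sample SQL and plan text are made up.

```ruby
require "stringio"

# Hypothetical sketch of the per-query report layout. `entries` is assumed to
# be an array of [table_name, sql, explain_text] triples.
def print_explain_report(entries, io: $stdout)
  total = entries.size
  entries.each_with_index do |(table_name, sql, explain_text), idx|
    io.puts "-- [#{idx + 1}/#{total}] #{table_name}"
    io.puts sql
    io.puts
    io.puts "-- EXPLAIN:"
    io.puts explain_text
    io.puts
  end
end

# Writing into a StringIO instead of $stdout makes the output easy to inspect.
buffer = StringIO.new
print_explain_report([["shops", "SELECT shops.id FROM shops", "SCAN shops"]], io: buffer)
puts buffer.string
```

Since the headers are SQL comments (`--`), the whole report stays valid to paste into a SQL console.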
|
data/lib/exwiw/mongodb_collection_config.rb
CHANGED
|
@@ -13,6 +13,7 @@ module Exwiw
|
|
|
13
13
|
attribute :belongs_tos, array(BelongsTo)
|
|
14
14
|
attribute :fields, array(MongodbField)
|
|
15
15
|
attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
|
|
16
|
+
attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
|
|
16
17
|
|
|
17
18
|
# Marks this config as physically embedded inside another collection's
|
|
18
19
|
# documents. When set, this config is not processed as a standalone dump
|
data/lib/exwiw/runner.rb
CHANGED
|
@@ -30,6 +30,8 @@ module Exwiw
|
|
|
30
30
|
adapter = Adapter.build(@connection_config, @logger)
|
|
31
31
|
configs = load_table_config(adapter.class.table_config_class)
|
|
32
32
|
|
|
33
|
+
configs = reject_and_validate_skipped(configs)
|
|
34
|
+
|
|
33
35
|
table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
|
|
34
36
|
|
|
35
37
|
target = table_by_name[@dump_target.table_name]
|
|
@@ -120,5 +122,31 @@ module Exwiw
|
|
|
120
122
|
klass.from(json)
|
|
121
123
|
end
|
|
122
124
|
end
|
|
125
|
+
|
|
126
|
+
private def reject_and_validate_skipped(configs)
|
|
127
|
+
skipped_names = configs.select { |c| c.skip }.map(&:name).to_set
|
|
128
|
+
return configs if skipped_names.empty?
|
|
129
|
+
|
|
130
|
+
configs.each do |config|
|
|
131
|
+
next if config.skip
|
|
132
|
+
next unless config.respond_to?(:belongs_tos)
|
|
133
|
+
|
|
134
|
+
dangling = config.belongs_tos.select { |rel| skipped_names.include?(rel.table_name) }
|
|
135
|
+
next if dangling.empty?
|
|
136
|
+
|
|
137
|
+
raise ArgumentError,
|
|
138
|
+
"Table '#{config.name}' has belongs_to references to skipped table(s): " \
|
|
139
|
+
"#{dangling.map(&:table_name).join(', ')}. " \
|
|
140
|
+
"Remove the belongs_to entries or unset `skip` on the referenced table."
|
|
141
|
+
end
|
|
142
|
+
|
|
143
|
+
if @dump_target.table_name && skipped_names.include?(@dump_target.table_name)
|
|
144
|
+
raise ArgumentError,
|
|
145
|
+
"--target-table '#{@dump_target.table_name}' is marked skip:true and cannot be used as a dump target."
|
|
146
|
+
end
|
|
147
|
+
|
|
148
|
+
skipped_names.each { |n| @logger.info("Skipping table '#{n}' (skip:true)") }
|
|
149
|
+
configs.reject { |c| c.skip }
|
|
150
|
+
end
|
|
123
151
|
end
|
|
124
152
|
end
|
data/lib/exwiw/table_config.rb
CHANGED
|
@@ -10,6 +10,7 @@ module Exwiw
|
|
|
10
10
|
attribute :belongs_tos, array(BelongsTo)
|
|
11
11
|
attribute :columns, array(TableColumn)
|
|
12
12
|
attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
|
|
13
|
+
attribute :skip, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
|
|
13
14
|
|
|
14
15
|
def self.from_symbol_keys(hash)
|
|
15
16
|
from(JSON.parse(hash.to_json))
|
|
@@ -76,6 +77,7 @@ module Exwiw
|
|
|
76
77
|
merged_table.filter = filter
|
|
77
78
|
merged_table.belongs_tos = passed_table.belongs_tos
|
|
78
79
|
merged_table.bulk_insert_chunk_size = passed_table.bulk_insert_chunk_size
|
|
80
|
+
merged_table.skip = skip
|
|
79
81
|
|
|
80
82
|
receiver_column_by_name = columns.each_with_object({}) { |column, hash| hash[column.name] = column }
|
|
81
83
|
|
data/lib/exwiw/version.rb
CHANGED
data/lib/exwiw.rb
CHANGED
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: exwiw
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.2.
|
|
4
|
+
version: 0.2.1
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Shia
|
|
@@ -38,6 +38,7 @@ files:
|
|
|
38
38
|
- docs/plans/2026-05-15-insert-000-schema-file.md
|
|
39
39
|
- docs/plans/2026-05-16-mongodb-from-clean-scenario.md
|
|
40
40
|
- docs/plans/2026-05-22-after-insert-hook.md
|
|
41
|
+
- docs/plans/2026-05-22-postgres-copy-mode-scenario-test.md
|
|
41
42
|
- exe/exwiw
|
|
42
43
|
- lib/exwiw.rb
|
|
43
44
|
- lib/exwiw/adapter.rb
|
|
@@ -51,6 +52,7 @@ files:
|
|
|
51
52
|
- lib/exwiw/ddl_postprocessor.rb
|
|
52
53
|
- lib/exwiw/determine_table_processing_order.rb
|
|
53
54
|
- lib/exwiw/embedded_in.rb
|
|
55
|
+
- lib/exwiw/explain_runner.rb
|
|
54
56
|
- lib/exwiw/mongo_query.rb
|
|
55
57
|
- lib/exwiw/mongodb_collection_config.rb
|
|
56
58
|
- lib/exwiw/mongodb_field.rb
|