exwiw 0.1.6 → 0.1.8

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c3fe2fb0c2cff1c962899799f4cfe22981a7b730759d456dc01ab8d19254dbec
4
- data.tar.gz: 7bf651780e74074b50e9b0345a7cdcb9ec0281cd3d60ff04764cf03c1804b32e
3
+ metadata.gz: 7c5e29a492af74dfbfa0e778fcb527a218d3a33507024646b2fe88b495581c2f
4
+ data.tar.gz: bd66516a56f40e4147e76e3fc662c98cd3c0261eb54a310ef7af49bcc5373cf0
5
5
  SHA512:
6
- metadata.gz: a8ee7520812cb8a655cd4f9b8dffa3afdd622311c6b39c5b7942aae5743c3642cb44a7b6b2fdd56ffdf967a8225c220866691a8a001bf75606c70871d4d731ca
7
- data.tar.gz: 10182f0c69f0d773e1a515f932148c7b8795a51ba64e46141d0630459c815594e470ac62e4d082a695426ec254430dcac6c3cdbaff74b47e34c49f0dfa9a901a
6
+ metadata.gz: 1b63d52ce0abd624695b73d782c64a5d4dd861e701f7c5989e977dd506d0178f4e2d394e9ae57e1106bf83898b622bd93681c60bdfbbde7dcd72bf1796cd4847
7
+ data.tar.gz: 36ea50859424c45eb3bd83ffc84fb148eb4080345e7dfc012ff81b32003fa27e83f43324c4e7fa296c4d35ee8de6f89754a9e9f1449bd69d3d1b4066015e51bc
data/CHANGELOG.md CHANGED
@@ -2,6 +2,26 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.1.8] - 2026-05-16
6
+
7
+ ### Added
8
+
9
+ - Emit a leading `insert-000-schema.{sql,js}` file alongside the per-table `insert-*` files so the generated dump can be applied to an empty database in one go. ([#14](https://github.com/heyinc/exwiw/pull/14))
10
+ - SQL adapters (`mysql2`, `postgresql`, `sqlite3`) write idempotent `CREATE TABLE IF NOT EXISTS` (and `CREATE INDEX IF NOT EXISTS` where the engine supports it) by shelling out to `mysqldump` / `pg_dump` / reading `sqlite_master`. PostgreSQL `ALTER TABLE ... ADD CONSTRAINT` is wrapped in a `DO $$ EXCEPTION WHEN duplicate_object` block.
11
+ - MongoDB adapter writes `insert-000-schema.js` containing `db.createCollection(...)` (wrapped in `try/catch` for `NamespaceExists`) and `db.<col>.createIndex(...)` calls for every top-level collection. Apply with `mongosh < dump/insert-000-schema.js`.
12
+
13
+ ### Fixed
14
+
15
+ - PostgreSQL adapter now appends a `setval` for each table's sequence at the end of the `insert-*.sql` file, transcribing the source DB's `last_value` so `nextval` after restore does not collide with imported IDs. ([#19](https://github.com/heyinc/exwiw/pull/19))
16
+
17
+ ## [0.1.7] - 2026-05-14
18
+
19
+ ### Added
20
+
21
+ - Add embedded document support to the MongoDB adapter via `embedded_in: { collection_name, path }`. Embedded configs are not dumped as their own jsonl; their `replace_with` rules apply to subdocuments (Array or Hash, with multi-level nesting) inside the parent collection.
22
+
23
+ ## [0.1.6] - 2026-03-14
24
+
5
25
  ### Added
6
26
 
7
27
  - Add `bulk_insert_chunk_size` table config to split the generated `INSERT` statement into chunks of the specified size. ([#8](https://github.com/riseshia/exwiw/pull/8))
data/README.md CHANGED
@@ -64,12 +64,15 @@ exwiw \
64
64
 
65
65
  This command will generate sql files in the `dump` directory.
66
66
 
67
+ - `dump/insert-000-schema.sql` — idempotent `CREATE TABLE IF NOT EXISTS ...` for every table in scope. Apply this first to provision an empty database.
67
68
  - `dump/insert-{idx}-{table_name}.sql`
68
69
  - `dump/delete-{idx}-{table_name}.sql`
69
70
 
70
71
  `idx` is the order of the dump. A file with a larger idx may depend on one with a smaller idx,
  so import the dump files in order.
72
73
 
74
+ `insert-000-schema.sql` is generated by shelling out to the database client tools (`mysqldump` for `mysql2`, `pg_dump` for `postgresql`, and the sqlite3 driver for `sqlite3`), so the corresponding client must be available on PATH when running exwiw. The output is post-processed to make it idempotent: `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS` (where the engine supports it), and PostgreSQL's `ALTER TABLE ... ADD CONSTRAINT` statements are wrapped in `DO $$ ... EXCEPTION WHEN duplicate_object`.
75
+
73
76
  You need to delete the existing records before importing the dump;
  `delete-{idx}-{table_name}.sql` helps you do that.
  This SQL deletes all records related to the extract targets.
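
The post-processing that makes `insert-000-schema.sql` idempotent can be pictured with a small Ruby sketch. This is an illustration under assumptions, not the gem's code: the module name `DdlSketch` and the regular expressions are hypothetical, and the real `DdlPostprocessor` may handle more cases.

```ruby
# Hypothetical helper module; the gem's real DdlPostprocessor may differ.
module DdlSketch
  module_function

  # Rewrite line-leading `CREATE TABLE ` into the idempotent form,
  # skipping statements that already carry IF NOT EXISTS.
  def add_if_not_exists_to_create_table(sql)
    sql.gsub(/^CREATE TABLE (?!IF NOT EXISTS )/, 'CREATE TABLE IF NOT EXISTS ')
  end

  # Same idea for `CREATE INDEX` and `CREATE UNIQUE INDEX`.
  def add_if_not_exists_to_create_index(sql)
    sql.gsub(/^CREATE (UNIQUE )?INDEX (?!IF NOT EXISTS )/) do
      "CREATE #{Regexp.last_match(1)}INDEX IF NOT EXISTS "
    end
  end
end

ddl = <<~SQL
  CREATE TABLE shops (id integer);
  CREATE UNIQUE INDEX idx_shops_name ON shops (name);
SQL

result = DdlSketch.add_if_not_exists_to_create_index(
  DdlSketch.add_if_not_exists_to_create_table(ddl)
)
puts result
```

Because the negative lookahead skips statements that already contain `IF NOT EXISTS`, running the rewrite twice leaves the text unchanged.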
@@ -182,10 +185,52 @@ The MongoDB adapter is experimental. To use it:
182
185
  ```bash
183
186
  mongoimport --db app_dev --collection users --file dump/insert-002-users.jsonl
184
187
  ```
188
+ - The leading `dump/insert-000-schema.js` contains `db.createCollection(...)` and `db.<col>.createIndex(...)` calls for every top-level collection (indexes are introspected from the source via `listIndexes`; the auto-created `_id_` index is skipped). Apply it with mongosh **before** running `mongoimport`:
189
+ ```bash
190
+ mongosh "mongodb://localhost/app_dev" dump/insert-000-schema.js
191
+ ```
185
192
  - Unlike SQL adapters, the MongoDB adapter does not emit `delete-*.jsonl` files (drop the database / collection yourself before importing if needed).
186
193
  - `raw_sql` is not supported (the `MongodbField` schema does not declare it; any `raw_sql` keys in scenario JSON are silently dropped on load). Use `replace_with` for masking.
187
194
  - The MongoDB adapter does not support the collection-level `filter` field (it raises `NotImplementedError` if set, since the SQL-string filter cannot be applied to MongoDB).
188
195
 
196
+ #### Embedded documents
197
+
198
+ MongoDB models often store one-to-many relationships as embedded subdocument arrays (e.g. `users` documents with a `posts: [...]` field). To mask fields inside embedded subdocuments, declare a separate config with `embedded_in`:
199
+
200
+ ```jsonc
201
+ // scenario/users.json — top-level collection
202
+ {
203
+ "name": "users",
204
+ "primary_key": "_id",
205
+ "belongs_tos": [{ "table_name": "shops", "foreign_key": "shop_id" }],
206
+ "fields": [
207
+ { "name": "_id" },
208
+ { "name": "name", "replace_with": "masked{_id}" },
209
+ { "name": "shop_id" }
210
+ ]
211
+ }
212
+
213
+ // scenario/posts.json — embedded under users.posts
214
+ {
215
+ "name": "posts",
216
+ "primary_key": "_id",
217
+ "embedded_in": { "collection_name": "users", "path": "posts" },
218
+ "belongs_tos": [],
219
+ "fields": [
220
+ { "name": "_id" },
221
+ { "name": "title", "replace_with": "masked-{_id}" }
222
+ ]
223
+ }
224
+ ```
225
+
226
+ At runtime:
227
+
228
+ - `posts` is **not** dumped as its own jsonl file. Its `replace_with` rules are applied to the subdocuments inside the parent `users` document at the path `posts`.
229
+ - `path` accepts dot-separated paths for nested fields (e.g. `"profile.contacts"`).
230
+ - Both arrays of subdocuments and a single Hash subdocument at `path` are supported. Multiple levels of nesting work via embedded chains.
231
+ - Cross-collection references from inside an embedded subdocument (`belongs_tos` on an embedded config) are not supported and raise `ArgumentError` on load.
232
+ - Specifying an embedded config as `--target-table` raises `NotImplementedError`; pass the top-level collection name instead.
233
+
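
The embedded-document rules above can be sketched in plain Ruby. The traversal mirrors the behavior described (dot-separated `path`, Array or Hash at the leaf); the `{field}` template expansion in `expand_template` is an assumption about how `replace_with` behaves, and both method names are hypothetical.

```ruby
# Yield every Hash subdocument found at dotted_path
# (an Array of Hashes or a single Hash at the leaf).
def each_subdocument(doc, dotted_path)
  *prefix, last = dotted_path.split('.')
  container = prefix.reduce(doc) { |acc, seg| acc.is_a?(Hash) ? acc[seg] : nil }
  return unless container.is_a?(Hash)

  case (value = container[last])
  when Array then value.each { |sub| yield sub if sub.is_a?(Hash) }
  when Hash then yield value
  end
end

# Assumed template semantics: "{name}" is replaced by the value of
# the field `name` in the same subdocument.
def expand_template(template, subdoc)
  template.gsub(/\{(\w+)\}/) { subdoc[Regexp.last_match(1)].to_s }
end

user = {
  '_id' => 1,
  'name' => 'Alice',
  'posts' => [
    { '_id' => 10, 'title' => 'Hello' },
    { '_id' => 11, 'title' => 'World' }
  ]
}

each_subdocument(user, 'posts') do |post|
  post['title'] = expand_template('masked-{_id}', post)
end

titles = user['posts'].map { |post| post['title'] }
puts titles.inspect
```

The parent document is mutated in place, which matches the statement that `posts` is masked inside `users` rather than dumped as its own file.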
189
234
  ## How it works
190
235
 
191
236
  - Load the table information from the specified config file.
@@ -0,0 +1,151 @@
+ # Plan: add `insert-000-schema.{sql,js}` to the dump output
+
+ ## Context
+
+ Today the `exwiw` dump output (`dump/`) contains only `insert-NNN-{table}.sql` and `delete-NNN-{table}.sql`; schema definitions such as `CREATE TABLE` are managed separately. As a result, importing a dump into another environment requires the user to prepare the schema separately: applying the dump to an empty DB fails, and it cannot be applied idempotently.
+
+ Goal: emit a schema file ahead of the existing per-table insert files.
+
+ - **SQL adapters** (`mysql2` / `postgresql` / `sqlite3`): write `CREATE TABLE IF NOT EXISTS ...` and related statements together into `insert-000-schema.sql`
+ - **MongoDB adapter**: write `db.createCollection(...)` and `db.<col>.createIndex(...)` calls together into `insert-000-schema.js`
+
+ With this, a dump can be fully applied even to an empty DB by running `insert-000-schema.*` → `insert-001-...` → ... in order.
+
+ ## Design
+
+ ### Adopted approach (confirmed with the user)
+ - DDL is obtained by **shelling out to external commands** (`mysqldump` / `pg_dump` / `sqlite3 .schema` for the SQL adapters). MongoDB has no CLI equivalent, so the already-required `mongo` Ruby driver is used with `listCollections` / `listIndexes` (mongosh is not added as a requirement).
+ - Output is **consolidated into one file**: `insert-000-schema.{sql,js}`
+ - Scope is **a verbatim transcription of the full definitions in the DB**. It is not narrowed by the config's `columns` / `fields`.
+ - For Mongo, emit a **.js file that can be run with mongosh**.
+
+ ### Architecture
+ 1. Add new methods to `Adapter::Base` (no-op by default):
+ - `dump_schema(ordered_tables, output_path, logger)`: writes the schema DDL to `output_path` (an absolute file path)
+ - `schema_output_extension`: defaults to `'sql'`; Mongo overrides it to return `'js'`
+ 2. Call it exactly once in `Runner#run`, right after `mkdir_p` and before the per-table loop:
+ ```ruby
+ ordered_tables = ordered_table_names.map { |n| table_by_name.fetch(n) }
+ schema_path = File.join(@output_dir, "insert-000-schema.#{adapter.schema_output_extension}")
+ adapter.dump_schema(ordered_tables, schema_path, @logger)
+ ```
+ - `ordered_table_names` is already sorted in dependency order (`DetermineTableProcessingOrder`), so DDL with FK constraints can be applied as is.
+ 3. `dump_schema` implementation for each adapter:
+
+ **Sqlite3Adapter**: uses the connection of the already-required `sqlite3` gem to run `SELECT type, sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY CASE type WHEN 'table' THEN 0 WHEN 'index' THEN 1 ELSE 2 END`. Auto-generated entries such as `sqlite_sequence` (those whose `sql` is NULL) are excluded. Each row's `sql` is then post-processed:
+ - `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `
+ - `CREATE INDEX ` → `CREATE INDEX IF NOT EXISTS ` (supported by SQLite)
+ - `CREATE UNIQUE INDEX ` → likewise
+ - `CREATE TRIGGER ` / `CREATE VIEW ` also get IF NOT EXISTS in the same way
+
+ Design choice: the user chose "shell out", but for SQLite it is safer to reuse the existing connection (no dependency on an external `sqlite3` binary being present). If shelling out uniformly is preferred, see option B.
+
+ **Mysql2Adapter**: shells out to `mysqldump`:
+ ```
+ MYSQL_PWD=$password mysqldump \
+ --host={host} --port={port} --user={user} \
+ --no-data --skip-add-drop-table --skip-comments \
+ --skip-extended-insert --skip-set-charset \
+ --compact \
+ {database}
+ ```
+ - The password is passed via the `MYSQL_PWD` env var rather than a command-line argument (so it does not appear in the process list)
+ - Post-process stdout to rewrite `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `. MySQL does **not support** `CREATE INDEX IF NOT EXISTS`, but mysqldump emits indexes inline inside `CREATE TABLE`, so this is normally not a problem
+ - Runtime SET statements such as `/*!40101 ...*/` are left as is (mysqldump emits them for import compatibility)
+
+ **PostgresqlAdapter**: shells out to `pg_dump`:
+ ```
+ PGPASSWORD=$password pg_dump \
+ --host={host} --port={port} --username={user} \
+ --schema-only --no-owner --no-acl --no-comments \
+ {database}
+ ```
+ - Post-process stdout to rewrite `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `, `CREATE INDEX ` → `CREATE INDEX IF NOT EXISTS `, and `CREATE UNIQUE INDEX ` → `CREATE UNIQUE INDEX IF NOT EXISTS ` (PG supports both)
+ - `ALTER TABLE ... ADD CONSTRAINT` fails when applied twice. PostgreSQL (9.6+) has no `IF NOT EXISTS` form for `ALTER TABLE ADD CONSTRAINT`, so the options are to wrap it in a `DO $$` block that checks `information_schema.table_constraints` for existence or, more simply, to detect `ALTER TABLE ONLY ... ADD CONSTRAINT` lines and **wrap them in a DO block**. Implementation approach: wrap each `ALTER TABLE ... ADD CONSTRAINT "name" ...;` as:
+ ```sql
+ DO $$ BEGIN
+ ALTER TABLE ... ADD CONSTRAINT "name" ...;
+ EXCEPTION WHEN duplicate_object THEN NULL; END $$;
+ ```
+ - `SET ...` / `SELECT pg_catalog.set_config(...)` lines are left as is.
+ - `CREATE SCHEMA public` does not appear in plain pg_dump output, but if it is present it is rewritten to `CREATE SCHEMA IF NOT EXISTS`.
+
+ **MongodbAdapter**: uses the already-required `mongo` driver and loops over `tables.reject(&:embedded?)`:
+ ```ruby
+ db.list_collections.each { |c| existing_collections << c['name'] } # for skip-creation idempotency hint
+ ordered_tables.reject(&:embedded?).each do |config|
+ indexes = db[config.name].indexes.to_a.reject { |idx| idx['name'] == '_id_' }
+ # emit JS lines
+ end
+ ```
+ Sample of the emitted JS:
+ ```js
+ // Auto-generated by exwiw. Apply with: mongosh < insert-000-schema.js
+ try { db.createCollection("users"); } catch (e) { if (e.code !== 48) throw e; } // 48 = NamespaceExists
+ db.users.createIndex({"shop_id":1}, {"name":"index_users_on_shop_id"});
+ try { db.createCollection("shops"); } catch (e) { if (e.code !== 48) throw e; }
+ ```
+ - The `_id_` index is created automatically by MongoDB, so it is excluded
+ - `createIndex` is idempotent when the key spec and options match (re-creating is a no-op)
+ - Index documents carry driver-derived metadata such as `v` / `ns`, so apart from `key` and `name` only `unique` / `sparse` / `partialFilterExpression` / `expireAfterSeconds` / `collation` are included in the output (allowlist)
+ - `MongodbAdapter#schema_output_extension` returns `'js'`
+ - Embedded configs (`embedded_in`) live inside their parent collection and need no indexes of their own, so they are skipped
+
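
The MongoDB emission step can be sketched without a database by feeding it index documents shaped like `listIndexes` results. This is a minimal illustration under assumptions: `schema_js_lines` is a hypothetical name, the allowlist is the one named in this plan (the shipped adapter may allow more), and plain Hashes stand in for driver objects.

```ruby
require 'json'

# Allowlist taken from the plan text; the shipped adapter may differ.
INDEX_OPTIONS = %w[unique sparse partialFilterExpression expireAfterSeconds collation].freeze

# Build the mongosh lines for one collection from index documents
# (plain Hashes here, so no MongoDB connection is needed).
def schema_js_lines(collection, index_docs)
  col = JSON.generate(collection)
  lines = [%(try { db.createCollection(#{col}); } catch (e) { if (e.code !== 48) throw e; })]

  index_docs.reject { |idx| idx['name'] == '_id_' }.each do |idx|
    opts = idx.slice(*INDEX_OPTIONS) # drops `v`, `ns`, and other metadata
    opts['name'] = idx['name'] if idx['name']
    lines << %(db.getCollection(#{col}).createIndex(#{JSON.generate(idx['key'])}, #{JSON.generate(opts)});)
  end
  lines
end

index_docs = [
  { 'v' => 2, 'key' => { '_id' => 1 }, 'name' => '_id_' },
  { 'v' => 2, 'key' => { 'name' => 1 }, 'name' => 'idx_shops_name', 'unique' => true }
]
js_lines = schema_js_lines('shops', index_docs)
puts js_lines
```

Serializing names and keys through `JSON.generate` keeps quoting consistent, and `db.getCollection(...)` also works for collection names that are not valid JavaScript identifiers.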
+ ### Post-processing utility
+ Create a new file `lib/exwiw/ddl_postprocessor.rb`, called from each SQL adapter:
+ - `.add_if_not_exists_to_create_table(sql)`: rewrites a line-leading `CREATE TABLE ` to `CREATE TABLE IF NOT EXISTS ` (skipped if IF NOT EXISTS is already present)
+ - `.add_if_not_exists_to_create_index(sql)`: handles the `CREATE [UNIQUE] INDEX ` forms
+ - `.wrap_add_constraint_in_do_block(sql)`: PG only
+
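
As one possible shape for the PG-only helper listed above, the following sketch wraps each `ALTER TABLE ... ADD CONSTRAINT ...;` statement in a `DO` block. It assumes each statement ends at the first `;`, which matches the multi-line layout pg_dump typically produces but not arbitrary SQL; the regular expression is an illustration, not the gem's implementation.

```ruby
# Hypothetical sketch of the DO-block wrapping described in this plan.
# Assumption: no ';' appears inside the ADD CONSTRAINT statement itself.
def wrap_add_constraint_in_do_block(sql)
  sql.gsub(/^ALTER TABLE\b[^;]*?\bADD CONSTRAINT\b[^;]*;/) do |stmt|
    "DO $$ BEGIN\n#{stmt}\nEXCEPTION WHEN duplicate_object THEN NULL; END $$;"
  end
end

sql = <<~SQL
  CREATE TABLE IF NOT EXISTS public.users (id bigint NOT NULL);
  ALTER TABLE ONLY public.users
      ADD CONSTRAINT users_pkey PRIMARY KEY (id);
SQL

wrapped = wrap_add_constraint_in_do_block(sql)
puts wrapped
```

Statements other than `ADD CONSTRAINT` pass through untouched, so this step can run after the `IF NOT EXISTS` rewrites without interfering with them.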
+ ### CLI behavior
+ No CLI flag is added. `insert-000-schema.*` is always emitted (as the user chose). However, if the external command is not on PATH, exwiw stops with a clear error message:
+ ```
+ Error: `pg_dump` not found in PATH. exwiw needs pg_dump to generate insert-000-schema.sql for the postgresql adapter.
+ ```
+
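
One way to get the "not found in PATH" behavior is to rely on the fact that `Open3.capture3` raises `Errno::ENOENT` when the executable does not exist, rather than inspecting stderr. The wrapper below is a hedged sketch: `run_dump_command` and the binary name in the usage are hypothetical, and the error wording is only modeled on the message above.

```ruby
require 'open3'

# Hypothetical wrapper: run an external dump command and turn a missing
# binary into an explicit error message.
def run_dump_command(env, cmd)
  stdout, stderr, status = Open3.capture3(env, *cmd)
  raise "#{cmd.first} failed (exit #{status.exitstatus}): #{stderr}" unless status.success?

  stdout
rescue Errno::ENOENT
  # Raised by the underlying spawn when the executable is not on PATH.
  raise "`#{cmd.first}` not found in PATH."
end

begin
  run_dump_command({}, ['exwiw-no-such-binary', '--version'])
rescue RuntimeError => e
  message = e.message
end
puts message
```

Passing the password through the `env` Hash (for example `MYSQL_PWD` or `PGPASSWORD`) keeps it out of the argument list, in line with the adapter notes earlier in this plan.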
+ ## Files to modify / add
+
+ | Path | Change |
+ |---|---|
+ | `lib/exwiw/adapter.rb` | Add `dump_schema(tables, path, logger)` and `schema_output_extension` to `Base` |
+ | `lib/exwiw/runner.rb` | Call `adapter.dump_schema(...)` right after `mkdir_p` |
+ | `lib/exwiw/adapter/sqlite3_adapter.rb` | Implement `dump_schema` (via `sqlite_master`, using the existing connection) |
+ | `lib/exwiw/adapter/mysql2_adapter.rb` | Implement `dump_schema` (shelling out to `mysqldump`) |
+ | `lib/exwiw/adapter/postgresql_adapter.rb` | Implement `dump_schema` (shelling out to `pg_dump`, wrapping ADD CONSTRAINT in DO blocks) |
+ | `lib/exwiw/adapter/mongodb_adapter.rb` | Implement `dump_schema`; override `schema_output_extension` |
+ | `lib/exwiw/ddl_postprocessor.rb` (new) | `IF NOT EXISTS` rewriting / DO block wrapping |
+ | `lib/exwiw.rb` | Require the new file |
+ | `README.md` | Add `insert-000-schema.{sql,js}` to the `dump/` output list and update the import steps |
+ | `spec/adapter/sqlite3_adapter_spec.rb` | Integration test for `dump_schema` (run against `scenario/initdb/init.sqlite3` and assert that the output contains `CREATE TABLE IF NOT EXISTS`) |
+ | `spec/adapter/mongodb_adapter_spec.rb` | Test for `dump_schema` (stub the db to return `listIndexes` results and assert on the emitted JS) |
+ | `spec/runner_spec.rb` | Assert that `insert-000-schema.sql` is written to `output_dir` (confirming it actually runs through Sqlite3) |
+
+ ## Existing code to reuse
+ - `lib/exwiw/connection_config.rb` (host / port / user / password / database_name): used to build the shell-out arguments
+ - `lib/exwiw/determine_table_processing_order.rb`: used as is, since the schema dump is also ordered by dependency
+ - `MongodbCollectionConfig#embedded?` (lib/exwiw/mongodb_collection_config.rb:33): skips embedded collections
+ - `MongodbAdapter#db` (lib/exwiw/adapter/mongodb_adapter.rb:197): existing lazy getter that returns the Mongo client
+
+ ## Verification
+
+ ### Unit / integration tests
+ 1. `bundle exec rspec spec/runner_spec.rb spec/adapter/sqlite3_adapter_spec.rb`: confirm that `insert-000-schema.sql` is generated on the sqlite3 path and that its content includes `CREATE TABLE IF NOT EXISTS "shops"`.
+ 2. `bundle exec rspec spec/adapter/mongodb_adapter_spec.rb`: stub the mongo client and confirm that the JS output includes `db.createCollection("users")` and the `createIndex(...)` calls for the corresponding collection.
+
+ ### E2E (via scenario scripts)
+ 3. Run `scenario/test_with_sqlite3.sh` and confirm that `dump/insert-000-schema.sql` is generated and that, against an empty DB, `sqlite3 empty.db < dump/insert-000-schema.sql && for f in dump/insert-*.sql; do sqlite3 empty.db < $f; done` succeeds.
+ 4. Likewise for `scenario/test_with_mysql2.sh` and `scenario/test_with_postgresql.sh`: confirm that `mysql empty_db < dump/insert-000-schema.sql` / `psql empty_db -f dump/insert-000-schema.sql` succeeds and that the insert files can then be applied. **If `mysqldump` / `pg_dump` must be run inside the docker compose container (the DB container started by `compose.yml`), update the scenario scripts.**
+ 5. Run `scenario/test_with_mongodb.sh` and confirm that `dump/insert-000-schema.js` is emitted, that `mongosh "mongodb://localhost/empty_db" < dump/insert-000-schema.js` succeeds against an empty DB, and that each jsonl can then be imported with `mongoimport`.
+ 6. **Idempotency check**: applying the same schema file twice must not raise an error (`IF NOT EXISTS` / `DO $$ EXCEPTION WHEN duplicate_object` / `try/catch on createCollection` are effective).
+
+ ### Manual checkpoints
+ - When run in an environment where `mysqldump` / `pg_dump` is not on PATH, it stops with an understandable error
+ - When the `DATABASE_PASSWORD` env var is absent, the external command must not fail with an authentication error (`MYSQL_PWD` / `PGPASSWORD` are passed when shelling out as well)
+ - Add an entry to `CHANGELOG.md`
+
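+ ## Notes / known risks
+ - **MySQL does not support `CREATE INDEX IF NOT EXISTS`**: mysqldump emits indexes inline in the table definition, so this is normally not an issue, but emitting them separately with options such as `--no-create-info --no-data --routines --triggers` would need separate handling. Only the default behavior is supported for now.
+ - **PG COMMENT ON / GRANT / SEQUENCE OWNED BY**: can be reduced with `--no-owner --no-acl --no-comments`. The `CREATE SEQUENCE` statement itself is included in the `IF NOT EXISTS` rewrite targets.
+ - **mongosh dependency**: applying the emitted JS requires mongosh (stated in the README). exwiw itself generates the file with the Ruby driver only, so mongosh is not needed on the host that runs exwiw.
+ - **Huge schemas**: the pg_dump / mysqldump output is loaded into memory for post-processing, so memory usage grows for extremely large schemas. This is not expected to be a problem in practice.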
@@ -0,0 +1,76 @@
+ # Plan: verify MongoDB's `insert-000-schema.js` end-to-end in the scenarios
+
+ ## Context
+
+ `lib/exwiw/adapter/mongodb_adapter.rb#dump_schema` already writes
+ `createCollection` / `createIndex` calls to `insert-000-schema.js`, but the scenarios
+ had no path that applies the file, so CI could not verify it either. Specific gaps:
+
+ 1. `scenario/setup_with_mongodb.rb` only loads the seed with `insert_many` and creates no indexes at all
+ 2. As a result, `tmp/mongodb/insert-000-schema.js` contains only `createCollection` lines and zero `createIndex` lines
+ 3. `scenario/import_with_mongodb.rb` globs and processes only `insert-*.jsonl` and never executes `insert-000-schema.js`
+
+ The "bring up from a clean DB" flow already introduced for sqlite3 / mysql2 / postgresql
+ was not connected to MongoDB's `insert-000-schema.js` (issue #16).
+
+ ## Goals
+
+ - Add to CI a scenario that applies `mongosh insert-000-schema.js` → `insert-*.jsonl` in order against an empty target DB
+ - Create representative indexes on the source DB so that `dump_schema` actually emits `createIndex` lines
+ - Verify that the generated createIndex lines actually run under mongosh and that the indexes round-trip on the target side
+ - Also pin the createIndex lines in the existing snapshot test (`spec/insert_output_snapshot_spec.rb`)
+
+ ## Changes
+
+ ### Scenario layer
+ | Path | Change |
+ |---|---|
+ | `scenario/setup_with_mongodb.rb` | After loading the seed, create three representative indexes (unique `shops.name` / plain `users.email` / compound `orders.shop_id+user_id`) |
+ | `scenario/import_with_mongodb.rb` | Add `--no-drop` and `--input-dir DIR` flags, because in the from-clean flow a drop would also remove the indexes created by schema.js |
+ | `scenario/verify_with_mongodb.rb` | With `--with-indexes`, assert on the target collections' indexes (skipped in the default scenario, where they are dropped at import time) |
+ | `scenario/test_with_mongodb_from_clean.sh` (new) | `mongosh dropDatabase` → run exwiw → `mongosh insert-000-schema.js` → `import --no-drop --input-dir tmp/mongodb-clean` → `verify --with-indexes` |
+ | `.github/workflows/scenario.yml` | Add a `mongodb-mongosh` install step and a `test_with_mongodb_from_clean.sh` step to the with_mongodb job. The apt repo codename is pinned to `jammy` (on the assumption that ubuntu-latest moves up to noble) |
+
+ ### Snapshot test layer
+ | Path | Change |
+ |---|---|
+ | `spec/support/bootstrap_databases.rb` | Create the same three indexes as the scenario during bootstrap |
+ | `spec/insert_output_snapshots/mongodb/insert-000-schema.js` | Regenerated with three `db.getCollection(...).createIndex(...)` lines added |
+
+ ## Design decisions
+
+ - **The unique index goes on `shops.name`, not `users.email`**: in the seed, `users.email`
+ repeats `user1@example.com` across 5 shops, so it cannot be unique. `shops.name`
+ ("Shop 1".."Shop 5") is unique in the seed, so it can be.
+ - **Minimize side effects on existing scenarios**: the default behavior of `import_with_mongodb.rb` is unchanged;
+ `--no-drop` is opt-in. The existing `test_with_mongodb.sh` runs unmodified.
+ - **One verify script for both uses**: `--with-indexes` toggles index checks for the from-clean flow only.
+ The existing scenario loses its indexes through drop→insert, so index verification is skipped there.
+ - **Installing mongosh in CI**: the `mongo:7` service container has mongosh, but that is
+ separate from the `mongosh` command on ubuntu-latest. Install it from MongoDB's apt repo (the `mongodb-mongosh`
+ package). The codename is pinned to `jammy` (there was a period when the MongoDB 7.0 repo did not
+ carry noble).
+ - **Update the snapshot fixture to include indexes**: bootstrap_databases.rb and
+ setup_with_mongodb.rb create the same indexes, so the expectations of the snapshot test and the scenario test
+ do not diverge.
+
+ ## Verification
+
+ - `bash scenario/test_with_mongodb.sh`: existing scenario still passes ✓
+ - `bash scenario/test_with_mongodb_from_clean.sh`: new scenario passes
+ (indexes round-trip OK) ✓
+ - `bundle exec rspec`: all 153 examples / 0 failures ✓
+ - Visually inspected `tmp/mongodb-clean/insert-000-schema.js`:
+ ```js
+ db.getCollection("shops").createIndex({"name":1}, {"unique":true,"name":"idx_shops_name"});
+ db.getCollection("users").createIndex({"email":1}, {"name":"idx_users_email"});
+ db.getCollection("orders").createIndex({"shop_id":1,"user_id":1}, {"name":"idx_orders_shop_user"});
+ ```
+
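+ ## Notes
+
+ - Flag parsing in `import_with_mongodb.rb` is hand-written ARGV parsing. If more arguments are added,
+ moving to OptionParser is worth considering (overkill for the current two flags).
+ - The apt repo's `jammy` pin is expected to keep working even if ubuntu-latest changes its codename in the future, but
+ the `7.0` in the repo URL must also be updated when moving to MongoDB 8.x.
+ - The scope of existing issue #16 is MongoDB only. from_clean for the SQL adapters was already introduced in a separate PR.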
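
If the hand-written ARGV parsing mentioned in the notes were moved to OptionParser, it might look like the sketch below. The method name `parse_import_options` and the default values (drop enabled, `tmp/mongodb` as the input directory) are assumptions made for illustration; only the two flag names come from this plan.

```ruby
require 'optparse'

# Hypothetical option parsing for the two flags named in this plan.
# Defaults are assumptions, not the script's documented behavior.
def parse_import_options(argv)
  options = { drop: true, input_dir: 'tmp/mongodb' }

  OptionParser.new do |opts|
    # `--[no-]drop` accepts both `--drop` and `--no-drop`.
    opts.on('--[no-]drop', 'Drop collections before importing') { |v| options[:drop] = v }
    opts.on('--input-dir DIR', 'Directory containing insert-*.jsonl') { |dir| options[:input_dir] = dir }
  end.parse!(argv.dup)

  options
end

from_clean = parse_import_options(%w[--no-drop --input-dir tmp/mongodb-clean])
puts from_clean.inspect
```

OptionParser also rejects unknown flags with `OptionParser::InvalidOption`, which a hand-rolled parser would have to implement separately.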
@@ -3,10 +3,9 @@
3
3
  require 'json'
4
4
 
5
5
  # NOTE: This adapter consumes MongodbCollectionConfig (`fields` instead of
6
- # the SQL adapters' `columns`). It assumes a "flat" document schema where
7
- # references between collections are expressed as scalar foreign keys
8
- # (e.g. `shop_id` on `users`); the forward fan-out strategy here cannot
9
- # follow references that live inside embedded structures.
6
+ # `columns`, plus `embedded_in`). Top-level collections are dumped as one
7
+ # jsonl per collection; configs marked `embedded_in` are not dumped on their
8
+ # own; their masking rules apply to subdocuments inside the parent.
10
9
  module Exwiw
11
10
  module Adapter
12
11
  class MongodbAdapter < Base
@@ -19,8 +18,32 @@ module Exwiw
19
18
  @state = {}
20
19
  end
21
20
 
22
- def build_query(config, dump_target, _config_by_name)
21
+ def dumpable?(config)
22
+ !config.embedded?
23
+ end
24
+
25
+ def validate_as_dump_target!(config)
26
+ return unless config.embedded?
27
+
28
+ raise NotImplementedError,
29
+ "dump_target '#{config.name}' is an embedded MongodbCollectionConfig; " \
30
+ "specify a top-level collection instead."
31
+ end
32
+
33
+ def build_query(config, dump_target, config_by_name)
34
+ if config.embedded?
35
+ raise NotImplementedError,
36
+ "MongodbAdapter#build_query was called with embedded config '#{config.name}'. " \
37
+ "Embedded configs are masked through the parent collection."
38
+ end
39
+
23
40
  reject_filter!(config)
41
+ # Stash the embedded-children index for the matching to_bulk_insert call
42
+ # below. The Adapter contract does not pass config_by_name to
43
+ # to_bulk_insert (SQL adapters don't need it), so we rely on the Runner
44
+ # invariant that build_query is always called before to_bulk_insert for
45
+ # the same config.
46
+ @embedded_children_by_parent = index_embedded_children(config_by_name)
24
47
 
25
48
  filter =
26
49
  if config.name == dump_target.table_name
@@ -57,9 +80,15 @@ module Exwiw
57
80
  docs
58
81
  end
59
82
 
83
+ # NOTE: relies on @embedded_children_by_parent set by a prior build_query
84
+ # call for the same config. This implicit ordering exists because the
85
+ # Adapter contract intentionally does not thread config_by_name through
86
+ # to_bulk_insert (SQL adapters don't need it). Safe in Runner, fragile in
87
+ # tests — call build_query first.
60
88
  def to_bulk_insert(rows, config)
61
89
  rows.map do |doc|
62
90
  apply_replace_with!(doc, config)
91
+ apply_embedded_masking!(doc, config)
63
92
  JSON.generate(extended_json(doc))
64
93
  end.join("\n")
65
94
  end
@@ -72,6 +101,47 @@ module Exwiw
72
101
  'jsonl'
73
102
  end
74
103
 
104
+ def schema_output_extension
105
+ 'js'
106
+ end
107
+
108
+ # Index options copied through to the emitted createIndex call. Anything
109
+ # else (`v`, `ns`, server-internal fields) is dropped — they would either
110
+ # be rejected by createIndex or are not portable across mongod versions.
111
+ INDEX_OPTION_ALLOWLIST = %w[
112
+ unique sparse hidden expireAfterSeconds collation
113
+ partialFilterExpression wildcardProjection
114
+ ].freeze
115
+
116
+ def dump_schema(ordered_tables, output_path)
117
+ require 'json'
118
+
119
+ collections = ordered_tables.reject(&:embedded?)
120
+
121
+ File.open(output_path, 'w') do |file|
122
+ file.puts("// Auto-generated by exwiw. Apply with: mongosh \"$MONGODB_URI\" #{File.basename(output_path)}")
123
+ file.puts
124
+
125
+ collections.each do |config|
126
+ name = config.name
127
+ file.puts(%(try { db.createCollection(#{JSON.generate(name)}); } catch (e) { if (e.code !== 48) throw e; }))
128
+ end
129
+ file.puts
130
+
131
+ collections.each do |config|
132
+ name = config.name
133
+ indexes = db[name].indexes.to_a.reject { |idx| idx['name'] == '_id_' }
134
+ indexes.each do |idx|
135
+ key = idx['key']
136
+ opts = idx.slice(*INDEX_OPTION_ALLOWLIST)
137
+ opts['name'] = idx['name'] if idx['name']
138
+ file.puts(%(db.getCollection(#{JSON.generate(name)}).createIndex(#{JSON.generate(key)}, #{JSON.generate(opts)});))
139
+ end
140
+ end
141
+ end
142
+ @logger.info(" Wrote schema for #{collections.size} collection(s) to #{output_path}.")
143
+ end
144
+
75
145
  def supports_bulk_delete?
76
146
  false
77
147
  end
@@ -96,6 +166,14 @@ module Exwiw
96
166
  "collection-level `filter` is not supported by MongodbAdapter (collection: #{config.name})"
97
167
  end
98
168
 
169
+ private def index_embedded_children(config_by_name)
170
+ config_by_name.each_value.with_object({}) do |child, acc|
171
+ next unless child.embedded?
172
+
173
+ (acc[child.embedded_in.collection_name] ||= []) << child
174
+ end
175
+ end
176
+
99
177
  private def build_projection(config)
100
178
  projection = {}
101
179
  # Always include primary key so masking templates referencing it work,
@@ -104,6 +182,11 @@ module Exwiw
104
182
  config.fields.each do |field|
105
183
  projection[field.name] = 1
106
184
  end
185
+ # Pull in paths owned by configs that mark themselves embedded in this
186
+ # collection, so the masker sees the subdocuments.
187
+ embedded_children_of(config).each do |child|
188
+ projection[child.embedded_in.path] = 1
189
+ end
107
190
  projection
108
191
  end
109
192
 
@@ -118,6 +201,32 @@ module Exwiw
118
201
  end
119
202
  end
120
203
 
204
+ private def apply_embedded_masking!(doc, parent_config)
205
+ embedded_children_of(parent_config).each do |child|
206
+ walk(doc, child.embedded_in.path) do |subdoc|
207
+ apply_replace_with!(subdoc, child)
208
+ apply_embedded_masking!(subdoc, child)
209
+ end
210
+ end
211
+ end
212
+
213
+ private def embedded_children_of(parent_config)
214
+ @embedded_children_by_parent.fetch(parent_config.name, [])
215
+ end
216
+
217
+ private def walk(doc, dotted_path)
218
+ segments = dotted_path.split(".")
219
+ *prefix, last = segments
220
+ container = prefix.reduce(doc) { |acc, seg| acc.is_a?(Hash) ? acc[seg] : nil }
221
+ return unless container.is_a?(Hash)
222
+
223
+ value = container[last]
224
+ case value
225
+ when Array then value.each { |sub| yield sub if sub.is_a?(Hash) }
226
+ when Hash then yield value
227
+ end
228
+ end
229
+
121
230
  private def extended_json(doc)
122
231
  if doc.respond_to?(:as_extended_json)
123
232
  doc.as_extended_json(mode: :relaxed)
@@ -14,6 +14,57 @@ module Exwiw
14
14
  connection.query(sql, cast: false, as: :array).to_a
15
15
  end
16
16
 
17
+ def dump_schema(ordered_tables, output_path)
18
+ require 'open3'
19
+
20
+ table_names = ordered_tables.map(&:name)
21
+ if table_names.empty?
22
+ File.write(output_path, "-- Auto-generated by exwiw. No tables in scope.\n")
23
+ return
24
+ end
25
+
26
+ cmd = [
27
+ 'mysqldump',
28
+ "--host=#{@connection_config.host}",
29
+ "--port=#{@connection_config.port}",
30
+ "--user=#{@connection_config.user}",
31
+ '--no-data',
32
+ '--skip-add-drop-table',
33
+ # `--skip-comments` only suppresses the dump's header lines
34
+ # (e.g. `-- MySQL dump ...`, server version banner). Column and
35
+ # table `COMMENT '...'` clauses are emitted inline inside
36
+ # CREATE TABLE statements and are NOT affected, so this flag is
37
+ # purely about reducing noise in the generated file.
38
+ '--skip-comments',
39
+ '--skip-set-charset',
40
+ # Suppress `SET @@GLOBAL.GTID_PURGED=...` from the dump. It is intended
41
+ # for replication setup and breaks when the target already has GTIDs
42
+ # (ERROR 3546: added gtid set must not overlap with @@GLOBAL.GTID_EXECUTED).
43
+ '--set-gtid-purged=OFF',
44
+ '--compact',
45
+ @connection_config.database_name,
46
+ *table_names,
47
+ ]
48
+ env = { 'MYSQL_PWD' => @connection_config.password.to_s }
49
+
50
+ @logger.debug(" Running mysqldump for #{table_names.size} table(s)...")
51
+ stdout, stderr, status = Open3.capture3(env, *cmd)
52
+ unless status.success?
53
+ if stderr.include?('command not found') || stderr.empty?
54
+ raise "Failed to run `mysqldump`. Ensure the mysql client is installed and on PATH. stderr: #{stderr}"
55
+ end
56
+ raise "mysqldump failed (exit #{status.exitstatus}): #{stderr}"
57
+ end
58
+
59
+ idempotent = DdlPostprocessor.add_if_not_exists_to_create_table(stdout)
60
+
61
+ File.open(output_path, 'w') do |file|
62
+ file.puts("-- Auto-generated by exwiw via mysqldump. Idempotent CREATE TABLE statements for mysql.")
63
+ file.write(idempotent)
64
+ end
65
+ @logger.info(" Wrote schema for #{table_names.size} table(s) to #{output_path}.")
66
+ end
67
+
17
68
  def to_bulk_insert(results, table)
18
69
  table_name = table.name
19
70
 
@@ -14,6 +14,51 @@ module Exwiw
14
14
  connection.exec(sql).values
15
15
  end
16
16
 
17
+ def dump_schema(ordered_tables, output_path)
18
+ require 'open3'
19
+
20
+ table_names = ordered_tables.map(&:name)
21
+ if table_names.empty?
22
+ File.write(output_path, "-- Auto-generated by exwiw. No tables in scope.\n")
23
+ return
24
+ end
25
+
26
+ cmd = [
27
+ 'pg_dump',
28
+ "--host=#{@connection_config.host}",
29
+ "--port=#{@connection_config.port}",
30
+ "--username=#{@connection_config.user}",
31
+ '--schema-only',
32
+ '--no-owner',
33
+ '--no-acl',
34
+ *table_names.flat_map { |t| ['--table', t] },
35
+ @connection_config.database_name,
36
+ ]
37
+ env = { 'PGPASSWORD' => @connection_config.password.to_s }
38
+
39
+ @logger.debug(" Running pg_dump for #{table_names.size} table(s)...")
40
+ stdout, stderr, status = Open3.capture3(env, *cmd)
41
+ unless status.success?
42
+ if stderr.include?('command not found') || stderr.empty?
43
+ raise "Failed to run `pg_dump`. Ensure the postgresql client is installed and on PATH. stderr: #{stderr}"
44
+ end
45
+ raise "pg_dump failed (exit #{status.exitstatus}): #{stderr}"
46
+ end
47
+
48
+ idempotent = stdout
49
+ idempotent = DdlPostprocessor.add_if_not_exists_to_create_schema(idempotent)
50
+ idempotent = DdlPostprocessor.add_if_not_exists_to_create_sequence(idempotent)
51
+ idempotent = DdlPostprocessor.add_if_not_exists_to_create_table(idempotent)
52
+ idempotent = DdlPostprocessor.add_if_not_exists_to_create_index(idempotent)
53
+ idempotent = DdlPostprocessor.wrap_add_constraint_in_do_block(idempotent)
54
+
55
+ File.open(output_path, 'w') do |file|
56
+ file.puts("-- Auto-generated by exwiw via pg_dump. Idempotent DDL for postgresql.")
57
+ file.write(idempotent)
58
+ end
59
+ @logger.info(" Wrote schema for #{table_names.size} table(s) to #{output_path}.")
60
+ end
61
+
17
62
  def to_bulk_insert(results, table)
18
63
  table_name = table.name
19
64
 
@@ -29,6 +74,36 @@ module Exwiw
29
74
  "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
30
75
  end
31
76
 
77
+ # Transcribe the FROM-side sequence cursor backing `table.primary_key`
78
+ # onto the import target. Without this, importing into a clean DB leaves
79
+ # the sequence at 1 while the inserted rows occupy higher IDs, so the
80
+ # next default-PK INSERT collides. We query FROM's `last_value` /
81
+ # `is_called` directly (matching what pg_dump emits) rather than using
82
+ # MAX(pk), so a subsetted dump still preserves the source's "next id".
83
+ # Returns nil for non-auto-increment PKs (pg_get_serial_sequence -> NULL).
84
+ #
85
+ # Scope: ONLY the sequence attached to the primary key is synced. If a
86
+ # table has additional auto-increment columns (e.g. a non-PK SERIAL),
87
+ # those sequences are NOT transcribed and a subsequent default-value
88
+ # INSERT on them can collide. Rails-managed schemas don't hit this
89
+ # because only `id` is auto-increment, but bare PostgreSQL schemas may.
90
+ def post_insert_sql(table)
91
+ pk = table.primary_key
92
+ return nil if pk.nil? || pk.empty?
93
+
94
+ seq_name = connection
95
+ .exec_params("SELECT pg_get_serial_sequence($1, $2)", [table.name, pk])
96
+ .values.dig(0, 0)
97
+ return nil if seq_name.nil?
98
+
99
+ last_value, is_called = connection
100
+ .exec("SELECT last_value, is_called FROM #{seq_name}")
101
+ .values.first
102
+ is_called_sql = (is_called == 't' || is_called == true) ? 'true' : 'false'
103
+
104
+ "SELECT pg_catalog.setval('#{escape_single_quote(seq_name)}', #{last_value}, #{is_called_sql});"
105
+ end
106
+
32
107
  def to_bulk_delete(select_query_ast, table)
33
108
  raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
34
109
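The `post_insert_sql` hunk above emits a single `setval` statement built from the source sequence's `last_value` and `is_called`. The following is a minimal sketch of that string-building step only, with no database connection. The helper name `build_setval_sql` and the quote-doubling escape are assumptions for illustration; the diff does not show the body of the gem's own `escape_single_quote`. With `is_called = false` the next `nextval` returns `last_value` itself, and with `true` it returns the following value, which is why both columns are copied rather than recomputed from `MAX(pk)`.

```ruby
# Hypothetical helper, not part of exwiw: builds the setval statement from
# values already read from the source sequence.
def build_setval_sql(seq_name, last_value, is_called)
  # Assumption: single quotes are escaped by doubling them, as in a SQL literal.
  quoted = seq_name.gsub("'", "''")
  # The pg gem returns booleans as 't' / 'f' strings unless type mapping is enabled.
  called = (is_called == 't' || is_called == true) ? 'true' : 'false'
  "SELECT pg_catalog.setval('#{quoted}', #{Integer(last_value)}, #{called});"
end

puts build_setval_sql('public.users_id_seq', '42', 't')
# => SELECT pg_catalog.setval('public.users_id_seq', 42, true);
```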
 
@@ -14,6 +14,47 @@ module Exwiw
14
14
  connection.execute(sql)
15
15
  end
16
16
 
17
+ def dump_schema(ordered_tables, output_path)
18
+ @logger.debug(" Reading schema from sqlite_master...")
19
+ target_names = ordered_tables.map(&:name)
20
+ # `sqlite_master` row order preserves table creation order, which is also
21
+ # the dependency order produced by ActiveRecord-style migrations. To respect
22
+ # the caller-provided order, we partition tables / their owned indexes by
23
+ # ordered_tables.
24
+ all = connection.execute(<<~SQL)
25
+ SELECT type, name, tbl_name, sql FROM sqlite_master
26
+ WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%'
27
+ SQL
28
+
29
+ tables_by_name = all.select { |type, _, _, _| type == 'table' }.to_h { |_, name, _, sql| [name, sql] }
30
+ indexes_by_owner = all.select { |type, _, _, _| type == 'index' }.group_by { |_, _, tbl, _| tbl }
31
+ triggers_by_owner = all.select { |type, _, _, _| type == 'trigger' }.group_by { |_, _, tbl, _| tbl }
32
+
33
+ statements = []
34
+ target_names.each do |name|
35
+ table_sql = tables_by_name[name]
36
+ next unless table_sql
37
+
38
+ statements << finalize_stmt(DdlPostprocessor.add_if_not_exists_to_create_table(table_sql.strip))
39
+ (indexes_by_owner[name] || []).each do |_, _, _, idx_sql|
40
+ statements << finalize_stmt(DdlPostprocessor.add_if_not_exists_to_create_index(idx_sql.strip))
41
+ end
42
+ (triggers_by_owner[name] || []).each do |_, _, _, trg_sql|
43
+ statements << finalize_stmt(trg_sql.strip)
44
+ end
45
+ end
46
+
47
+ File.open(output_path, 'w') do |file|
48
+ file.puts("-- Auto-generated by exwiw. Idempotent CREATE statements for sqlite3.")
49
+ file.puts(statements.join("\n"))
50
+ end
51
+ @logger.info(" Wrote #{statements.size} schema statement(s) to #{output_path}.")
52
+ end
53
+
54
+ private def finalize_stmt(stmt)
55
+ stmt.end_with?(';') ? stmt : "#{stmt};"
56
+ end
57
+
17
58
  def to_bulk_insert(results, table)
18
59
  table_name = table.name
19
60
 
data/lib/exwiw/adapter.rb CHANGED
@@ -30,10 +30,46 @@ module Exwiw
30
30
  'sql'
31
31
  end
32
32
 
33
+ # File extension used for the leading `insert-000-schema.*` file.
34
+ # SQL adapters emit `.sql` (CREATE TABLE IF NOT EXISTS ...);
35
+ # MongodbAdapter overrides to `.js` (mongosh-runnable createCollection / createIndex).
36
+ def schema_output_extension
37
+ 'sql'
38
+ end
39
+
40
+ # Write the leading schema-creation file for this adapter to `output_path`.
41
+ # Default is a no-op; subclasses override to emit idempotent DDL so the
42
+ # generated dump can be applied to an empty database.
43
+ #
44
+ # @param ordered_tables [Array] table configs in dependency order
45
+ # @param output_path [String] absolute path to write to
46
+ def dump_schema(ordered_tables, output_path)
47
+ end
48
+
33
49
  # Whether this adapter emits delete-NNN-*.sql files.
34
50
  def supports_bulk_delete?
35
51
  true
36
52
  end
53
+
54
+ # Whether the given config produces its own dump output and needs an
55
+ # independent processing pass. SQL adapters always do; non-SQL adapters
56
+ # may exclude e.g. embedded subdocument configs.
57
+ def dumpable?(_config)
58
+ true
59
+ end
60
+
61
+ # Hook for adapter-specific validation when this config is used as the
62
+ # dump_target. Default: nothing to validate.
63
+ def validate_as_dump_target!(_config)
64
+ end
65
+
66
+ # Optional SQL appended to the per-table insert-NNN-<table>.* file after
67
+ # the bulk INSERT statements. Use to bring side-state in sync with the
68
+ # explicit IDs that were just inserted (e.g. PostgreSQL sequences).
69
+ # Default: nil (nothing appended).
70
+ def post_insert_sql(_table)
71
+ nil
72
+ end
37
73
  end
38
74
 
39
75
  # @param [Exwiw::QueryAst] query_ast
@@ -0,0 +1,61 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ # Rewrites raw CREATE statements emitted by mysqldump / pg_dump /
5
+ # sqlite_master.sql into idempotent forms so the generated
6
+ # `insert-000-schema.sql` file can be re-applied without error.
7
+ module DdlPostprocessor
8
+ module_function
9
+
10
+ # `CREATE TABLE [name]` → `CREATE TABLE IF NOT EXISTS [name]`.
11
+ # `TEMP` / `TEMPORARY` variants and already-IF-NOT-EXISTS lines are skipped.
12
+ def add_if_not_exists_to_create_table(sql)
13
+ sql.gsub(/\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
14
+ "#{m} IF NOT EXISTS"
15
+ end
16
+ end
17
+
18
+ # `CREATE [UNIQUE] INDEX [name]` → `CREATE [UNIQUE] INDEX IF NOT EXISTS [name]`.
19
+ # Use only for databases that support it (PostgreSQL, SQLite). MySQL does NOT
20
+ # support `CREATE INDEX IF NOT EXISTS` — do not call from the MySQL adapter.
21
+ def add_if_not_exists_to_create_index(sql)
22
+ sql.gsub(/\bCREATE(\s+UNIQUE)?\s+INDEX\b(?!\s+IF\s+NOT\s+EXISTS)/i) do
23
+ unique = Regexp.last_match(1) || ""
24
+ "CREATE#{unique} INDEX IF NOT EXISTS"
25
+ end
26
+ end
27
+
28
+ # `CREATE SCHEMA [name]` → `CREATE SCHEMA IF NOT EXISTS [name]`.
29
+ def add_if_not_exists_to_create_schema(sql)
30
+ sql.gsub(/\bCREATE\s+SCHEMA\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
31
+ "#{m} IF NOT EXISTS"
32
+ end
33
+ end
34
+
35
+ # `CREATE SEQUENCE [name]` → `CREATE SEQUENCE IF NOT EXISTS [name]`.
36
+ def add_if_not_exists_to_create_sequence(sql)
37
+ sql.gsub(/\bCREATE\s+SEQUENCE\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
38
+ "#{m} IF NOT EXISTS"
39
+ end
40
+ end
41
+
42
+ # `ALTER TABLE ... ADD CONSTRAINT ...;` is not idempotent on its own.
43
+ # PostgreSQL's ALTER TABLE has no IF NOT EXISTS clause for ADD CONSTRAINT, so wrap
44
+ # each statement in a DO block that swallows `duplicate_object`.
45
+ # Matches only statements whose ALTER TABLE clause leads directly into ADD CONSTRAINT
46
+ # (no intervening ALTER COLUMN / DROP / etc) so that unrelated ALTER TABLE statements
47
+ # in the same dump are not absorbed.
48
+ ADD_CONSTRAINT_RE = /^[ \t]*ALTER\s+TABLE\s+(?:ONLY\s+)?[^\s;,]+\s+(?:\n[ \t]*)?ADD\s+CONSTRAINT\b[^;]*;/m.freeze
49
+
50
+ def wrap_add_constraint_in_do_block(sql)
51
+ sql.gsub(ADD_CONSTRAINT_RE) do |stmt|
52
+ <<~SQL.chomp
53
+ DO $exwiw$ BEGIN
54
+ #{stmt.strip}
55
+ EXCEPTION WHEN duplicate_object THEN NULL;
56
+ END $exwiw$;
57
+ SQL
58
+ end
59
+ end
60
+ end
61
+ end
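The rewrites in `DdlPostprocessor` rely on a negative lookahead so that a statement that already contains `IF NOT EXISTS` is left alone. The sketch below copies two of the regular expressions from the diff into standalone methods to show the effect on sample input. The method names `add_if_not_exists` and `wrap_add_constraint` and the sample DDL are illustrative assumptions, not the gem's API.

```ruby
# Standalone sketch; regexes copied from the diff, method names are hypothetical.
CREATE_TABLE_RE = /\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS)/i
ADD_CONSTRAINT_RE = /^[ \t]*ALTER\s+TABLE\s+(?:ONLY\s+)?[^\s;,]+\s+(?:\n[ \t]*)?ADD\s+CONSTRAINT\b[^;]*;/m

# Appends IF NOT EXISTS only where the lookahead says it is missing.
def add_if_not_exists(sql)
  sql.gsub(CREATE_TABLE_RE) { |m| "#{m} IF NOT EXISTS" }
end

# Wraps each ADD CONSTRAINT statement in a DO block that ignores duplicate_object.
def wrap_add_constraint(sql)
  sql.gsub(ADD_CONSTRAINT_RE) do |stmt|
    [
      "DO $exwiw$ BEGIN",
      "  #{stmt.strip}",
      "EXCEPTION WHEN duplicate_object THEN NULL;",
      "END $exwiw$;",
    ].join("\n")
  end
end

once = add_if_not_exists("CREATE TABLE users (id integer);")
puts once
puts add_if_not_exists(once) == once # second pass changes nothing
puts wrap_add_constraint("ALTER TABLE ONLY public.users\n    ADD CONSTRAINT users_pkey PRIMARY KEY (id);")
```

The lookahead makes the `CREATE TABLE` rewrite safe to apply twice to the same text. The `DO` wrapper is different: it makes the resulting SQL safe to execute twice, but the text transform itself is meant to run once on raw `pg_dump` output.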
@@ -0,0 +1,14 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ class EmbeddedIn
5
+ include Serdes
6
+
7
+ attribute :collection_name, String
8
+ attribute :path, String
9
+
10
+ def self.from_symbol_keys(hash)
11
+ from(hash.transform_keys(&:to_s))
12
+ end
13
+ end
14
+ end
@@ -14,8 +14,34 @@ module Exwiw
14
14
  attribute :fields, array(MongodbField)
15
15
  attribute :bulk_insert_chunk_size, optional(Integer), skip_serializing_if_nil: true
16
16
 
17
+ # Marks this config as physically embedded inside another collection's
18
+ # documents. When set, this config is not processed as a standalone dump
19
+ # unit; its masking rules are applied to the parent's subdocuments at
20
+ # `path`.
21
+ attribute :embedded_in, optional(EmbeddedIn), skip_serializing_if_nil: true
22
+
23
+ def self.from(obj)
24
+ instance = super
25
+ instance.__send__(:validate_embedded!)
26
+ instance
27
+ end
28
+
17
29
  def self.from_symbol_keys(hash)
18
30
  from(JSON.parse(hash.to_json))
19
31
  end
32
+
33
+ def embedded?
34
+ !embedded_in.nil?
35
+ end
36
+
37
+ private def validate_embedded!
38
+ return unless embedded?
39
+ return if belongs_tos.empty?
40
+
41
+ raise ArgumentError,
42
+ "MongodbCollectionConfig '#{name}' is embedded_in '#{embedded_in.collection_name}'; " \
43
+ "belongs_tos must be empty (cross-collection refs from inside embedded arrays " \
44
+ "are not supported)."
45
+ end
20
46
  end
21
47
  end
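The `embedded_in` rule in the hunk above can be illustrated without Serdes: an embedded config must not declare `belongs_tos`. The sketch below uses plain `Struct` stand-ins; the class names `EmbeddedInSketch` and `CollectionConfigSketch` and the sample collection names are hypothetical and only mirror the attribute names visible in the diff.

```ruby
# Hypothetical stand-ins for the Serdes-backed classes shown in the diff.
EmbeddedInSketch = Struct.new(:collection_name, :path, keyword_init: true)

CollectionConfigSketch = Struct.new(:name, :belongs_tos, :embedded_in, keyword_init: true) do
  def embedded?
    !embedded_in.nil?
  end

  # Raises when an embedded config also declares cross-collection references.
  def validate_embedded!
    return unless embedded?
    return if belongs_tos.empty?

    raise ArgumentError,
          "#{name} is embedded_in #{embedded_in.collection_name}; belongs_tos must be empty"
  end
end

parent_ref = EmbeddedInSketch.new(collection_name: 'orders', path: 'items')
valid = CollectionConfigSketch.new(name: 'line_items', belongs_tos: [], embedded_in: parent_ref)
valid.validate_embedded! # passes silently
```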
data/lib/exwiw/runner.rb CHANGED
@@ -20,16 +20,24 @@ module Exwiw
20
20
 
21
21
  def run
22
22
  adapter = Adapter.build(@connection_config, @logger)
23
- tables = load_table_config(adapter.class.table_config_class)
23
+ configs = load_table_config(adapter.class.table_config_class)
24
+
25
+ table_by_name = configs.each_with_object({}) { |config, hash| hash[config.name] = config }
26
+
27
+ target = table_by_name[@dump_target.table_name]
28
+ adapter.validate_as_dump_target!(target) if target
24
29
 
25
30
  @logger.info("Determining table processing order...")
26
- ordered_table_names = DetermineTableProcessingOrder.run(tables)
31
+ ordered_table_names = DetermineTableProcessingOrder.run(configs.select { |c| adapter.dumpable?(c) })
27
32
 
28
33
  if !Dir.exist?(@output_dir)
29
34
  FileUtils.mkdir_p(@output_dir)
30
35
  end
31
36
 
32
- table_by_name = tables.each_with_object({}) { |table, hash| hash[table.name] = table }
37
+ ordered_tables = ordered_table_names.map { |n| table_by_name.fetch(n) }
38
+ schema_path = File.join(@output_dir, "insert-000-schema.#{adapter.schema_output_extension}")
39
+ @logger.info("Writing schema to #{schema_path}...")
40
+ adapter.dump_schema(ordered_tables, schema_path)
33
41
 
34
42
  total_size = ordered_table_names.size
35
43
  ordered_table_names.each_with_index do |table_name, idx|
@@ -54,6 +62,8 @@ module Exwiw
54
62
  insert_idx = (idx + 1).to_s.rjust(3, '0')
55
63
  File.open(File.join(@output_dir, "insert-#{insert_idx}-#{table_name}.#{adapter.output_extension}"), 'w') do |file|
56
64
  file.puts(insert_sql)
65
+ post = adapter.post_insert_sql(table)
66
+ file.puts(post) if post
57
67
  end
58
68
 
59
69
  if adapter.supports_bulk_delete?
@@ -78,14 +88,5 @@ module Exwiw
78
88
  klass.from(json)
79
89
  end
80
90
  end
81
-
82
- private def build_adapter
83
- case @connection_config["adapter"]
84
- when "sqlite3"
85
- Sqlite3Adapter.new(@connection_config)
86
- else
87
- raise "Unsupported adapter"
88
- end
89
- end
90
91
  end
91
92
  end
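Taken together, the runner changes above call two adapter hooks: `dump_schema` once before any table is processed, and `post_insert_sql` once per table after the bulk INSERT is written. The sketch below reproduces only that file-naming and hook-ordering flow against a temporary directory. `SketchAdapter`, `SequenceAwareAdapter`, and `write_dump` are hypothetical names, and the SQL they emit is placeholder text rather than real exwiw output.

```ruby
require 'tmpdir'

# Hypothetical base adapter mirroring the default hooks in the diff.
class SketchAdapter
  def schema_output_extension
    'sql'
  end

  def output_extension
    'sql'
  end

  # Default: no-op, so no schema file is produced.
  def dump_schema(_ordered_tables, _output_path)
  end

  # Default: nothing appended after the INSERT statements.
  def post_insert_sql(_table)
    nil
  end
end

# Hypothetical subclass that overrides both hooks with placeholder SQL.
class SequenceAwareAdapter < SketchAdapter
  def dump_schema(ordered_tables, output_path)
    ddl = ordered_tables.map { |t| "CREATE TABLE IF NOT EXISTS #{t} (id integer);" }
    File.write(output_path, ddl.join("\n") + "\n")
  end

  def post_insert_sql(table)
    "SELECT pg_catalog.setval('#{table}_id_seq', 1, false);"
  end
end

# Writes insert-000-schema.* first, then one insert-NNN-<table>.* per table.
def write_dump(adapter, table_names, output_dir)
  schema_path = File.join(output_dir, "insert-000-schema.#{adapter.schema_output_extension}")
  adapter.dump_schema(table_names, schema_path)

  table_names.each_with_index do |name, idx|
    insert_idx = (idx + 1).to_s.rjust(3, '0')
    path = File.join(output_dir, "insert-#{insert_idx}-#{name}.#{adapter.output_extension}")
    File.open(path, 'w') do |file|
      file.puts("-- bulk INSERT for #{name} would go here")
      post = adapter.post_insert_sql(name)
      file.puts(post) if post
    end
  end
  Dir.children(output_dir).sort
end

Dir.mktmpdir do |dir|
  puts write_dump(SequenceAwareAdapter.new, %w[users posts], dir)
end
```

Because the schema file uses index `000`, applying the files in lexical order creates the tables before any per-table insert runs.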
data/lib/exwiw/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Exwiw
4
- VERSION = "0.1.6"
4
+ VERSION = "0.1.8"
5
5
  end
data/lib/exwiw.rb CHANGED
@@ -8,8 +8,10 @@ require "serdes"
8
8
  require_relative "exwiw/belongs_to"
9
9
  require_relative "exwiw/table_column"
10
10
  require_relative "exwiw/table_config"
11
+ require_relative "exwiw/embedded_in"
11
12
  require_relative "exwiw/mongodb_field"
12
13
  require_relative "exwiw/mongodb_collection_config"
14
+ require_relative "exwiw/ddl_postprocessor"
13
15
  require_relative "exwiw/adapter"
14
16
  require_relative "exwiw/adapter/sqlite3_adapter"
15
17
  require_relative "exwiw/adapter/mysql2_adapter"
data/mise.toml ADDED
@@ -0,0 +1,6 @@
1
+ [env]
2
+ # Prepend scenario/bin so `pg_dump` resolves to the wrapper that delegates to
3
+ # the postgres container (compose.yml). exwiw's PostgreSQL adapter shells out
4
+ # to pg_dump, which requires a server/client major-version match — the dev DB
5
+ # is postgres:17 while host clients are often older (e.g. Homebrew pg14).
6
+ _.path = ["./scenario/bin"]
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: exwiw
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.6
4
+ version: 0.1.8
5
5
  platform: ruby
6
6
  authors:
7
7
  - Shia
@@ -32,10 +32,11 @@ executables:
32
32
  extensions: []
33
33
  extra_rdoc_files: []
34
34
  files:
35
- - ".rubocop.yml"
36
35
  - CHANGELOG.md
37
36
  - LICENSE.txt
38
37
  - README.md
38
+ - docs/plans/2026-05-15-insert-000-schema-file.md
39
+ - docs/plans/2026-05-16-mongodb-from-clean-scenario.md
39
40
  - exe/exwiw
40
41
  - lib/exwiw.rb
41
42
  - lib/exwiw/adapter.rb
@@ -45,7 +46,9 @@ files:
45
46
  - lib/exwiw/adapter/sqlite3_adapter.rb
46
47
  - lib/exwiw/belongs_to.rb
47
48
  - lib/exwiw/cli.rb
49
+ - lib/exwiw/ddl_postprocessor.rb
48
50
  - lib/exwiw/determine_table_processing_order.rb
51
+ - lib/exwiw/embedded_in.rb
49
52
  - lib/exwiw/mongo_query.rb
50
53
  - lib/exwiw/mongodb_collection_config.rb
51
54
  - lib/exwiw/mongodb_field.rb
@@ -58,6 +61,7 @@ files:
58
61
  - lib/exwiw/table_config.rb
59
62
  - lib/exwiw/version.rb
60
63
  - lib/tasks/exwiw.rake
64
+ - mise.toml
61
65
  homepage: https://github.com/riseshia/exwiw
62
66
  licenses:
63
67
  - MIT
@@ -79,7 +83,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
79
83
  - !ruby/object:Gem::Version
80
84
  version: '0'
81
85
  requirements: []
82
- rubygems_version: 3.6.9
86
+ rubygems_version: 4.0.10
83
87
  specification_version: 4
84
88
  summary: Ruby gem that allows you to export records from a database to a dump file.
85
89
  test_files: []
data/.rubocop.yml DELETED
@@ -1,10 +0,0 @@
1
- plugins:
2
- - rubocop-greppable_rails
3
-
4
- AllCops:
5
- TargetRubyVersion: 3.3
6
- DisabledByDefault: true
7
- SuggestExtensions: false
8
-
9
- GreppableRails/UseInlineAccessModifier:
10
- Enabled: true