exwiw 0.1.7 → 0.1.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

checksums.yaml +4 -4
data/CHANGELOG.md +12 -0
data/README.md +7 -0
data/docs/plans/2026-05-15-insert-000-schema-file.md +151 -0
data/docs/plans/2026-05-16-mongodb-from-clean-scenario.md +76 -0
data/lib/exwiw/adapter/mongodb_adapter.rb +41 -0
data/lib/exwiw/adapter/mysql2_adapter.rb +51 -0
data/lib/exwiw/adapter/postgresql_adapter.rb +75 -0
data/lib/exwiw/adapter/sqlite3_adapter.rb +41 -0
data/lib/exwiw/adapter.rb +24 -0
data/lib/exwiw/ddl_postprocessor.rb +61 -0
data/lib/exwiw/runner.rb +7 -9
data/lib/exwiw/version.rb +1 -1
data/lib/exwiw.rb +1 -0
data/mise.toml +6 -0
metadata +6 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 4d94b3d27454accfa118d2ee8196f8df53ad026de2cce65c23d783b80ff9320d
-  data.tar.gz: 4292c5dca37b34d9a40892440603df62e77e8de62b68375e2990102da83c08f6
+  metadata.gz: 7c5e29a492af74dfbfa0e778fcb527a218d3a33507024646b2fe88b495581c2f
+  data.tar.gz: bd66516a56f40e4147e76e3fc662c98cd3c0261eb54a310ef7af49bcc5373cf0
 SHA512:
-  metadata.gz: 0cae5f397aff3258f7115625e2828d17579b754982287117384257d9858c5867d063b200954d978e54a95e86c5edf919203084f0913fdfdc2a156fde3f71d1cc
-  data.tar.gz: 41705e1dbcb3a9664e4fdeeaacd1da6b49f35131b1d03191adfdcc8c101bd348bc22d96fc4ebc6daf68fa67b4b70ce68a8f29eecdd986021b0ef8e9685191331
+  metadata.gz: 1b63d52ce0abd624695b73d782c64a5d4dd861e701f7c5989e977dd506d0178f4e2d394e9ae57e1106bf83898b622bd93681c60bdfbbde7dcd72bf1796cd4847
+  data.tar.gz: 36ea50859424c45eb3bd83ffc84fb148eb4080345e7dfc012ff81b32003fa27e83f43324c4e7fa296c4d35ee8de6f89754a9e9f1449bd69d3d1b4066015e51bc

data/CHANGELOG.md CHANGED

@@ -2,6 +2,18 @@
 ## [Unreleased]
+## [0.1.8] - 2026-05-16
+### Added
+- Emit a leading `insert-000-schema.{sql,js}` file alongside the per-table `insert-*` files so the generated dump can be applied to an empty database in one go. ([#14](https://github.com/heyinc/exwiw/pull/14))
+  - SQL adapters (`mysql2`, `postgresql`, `sqlite3`) write idempotent `CREATE TABLE IF NOT EXISTS` (and `CREATE INDEX IF NOT EXISTS` where the engine supports it) by shelling out to `mysqldump` / `pg_dump` / reading `sqlite_master`. PostgreSQL `ALTER TABLE ... ADD CONSTRAINT` is wrapped in a `DO $$ EXCEPTION WHEN duplicate_object` block.
+  - MongoDB adapter writes `insert-000-schema.js` containing `db.createCollection(...)` (wrapped in `try/catch` for `NamespaceExists`) and `db.<col>.createIndex(...)` calls for every top-level collection. Apply with `mongosh < dump/insert-000-schema.js`.
+### Fixed
+- PostgreSQL adapter now appends a `setval` for each table's sequence at the end of the `insert-*.sql` file, transcribing the source DB's `last_value` so `nextval` after restore does not collide with imported IDs. ([#19](https://github.com/heyinc/exwiw/pull/19))
 ## [0.1.7] - 2026-05-14
 ### Added

data/README.md CHANGED

@@ -64,12 +64,15 @@ exwiw \
 This command will generate sql files in the `dump` directory.
+- `dump/insert-000-schema.sql` — idempotent `CREATE TABLE IF NOT EXISTS ...` for every table in scope. Apply this first to provision an empty database.
 - `dump/insert-{idx}-{table_name}.sql`
 - `dump/delete-{idx}-{table_name}.sql`
 `idx` is the order of the dump. A file with a larger idx may depend on one with a smaller idx,
 so you should import the dump in order.
+`insert-000-schema.sql` is generated by shelling out to the database client tools (`mysqldump` for `mysql2`, `pg_dump` for `postgresql`, and the sqlite3 driver for `sqlite3`), so the corresponding client must be available on PATH when running exwiw. The output is post-processed to make it idempotent: `CREATE TABLE IF NOT EXISTS`, `CREATE INDEX IF NOT EXISTS` (where the engine supports it), and PostgreSQL's `ALTER TABLE ... ADD CONSTRAINT` statements are wrapped in `DO $$ ... EXCEPTION WHEN duplicate_object`.
 you need to delete the records before importing the dump,
 `delete-{idx}-{table_name}.sql` will help you to do that.
 This sql will delete "all" related records to the extract targets.
@@ -182,6 +185,10 @@ The MongoDB adapter is experimental. To use it:
   ```bash
   mongoimport --db app_dev --collection users --file dump/insert-002-users.jsonl
   ```
+- The leading `dump/insert-000-schema.js` contains `db.createCollection(...)` and `db.<col>.createIndex(...)` calls for every top-level collection (indexes are introspected from the source via `listIndexes`; the auto-created `_id_` index is skipped). Apply it with mongosh **before** running `mongoimport`:
+  ```bash
+  mongosh "mongodb://localhost/app_dev" dump/insert-000-schema.js
+  ```
 - Unlike SQL adapters, the MongoDB adapter does not emit `delete-*.jsonl` files (drop the database / collection yourself before importing if needed).
 - `raw_sql` is not supported (the `MongodbField` schema does not declare it; any `raw_sql` keys in scenario JSON are silently dropped on load). Use `replace_with` for masking.
 - The MongoDB adapter does not support the collection-level `filter` field (it raises `NotImplementedError` if set, since the SQL-string filter cannot be applied to MongoDB).
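
The README's ordering rule (apply `insert-000-schema.*` first, then the remaining `insert-*` files by ascending idx) could be scripted roughly as below. This is only a sketch: `apply_order` is a hypothetical helper, not something exwiw ships, and it assumes every file name follows the `insert-{idx}-...` pattern quoted above.

```ruby
# Hypothetical helper (not part of exwiw): pick the insert-* files out of a
# dump directory listing and sort them by their numeric idx, so that
# insert-000-schema.* comes first.
def apply_order(paths)
  inserts = paths.select { |path| File.basename(path).start_with?('insert-') }
  inserts.sort_by { |path| File.basename(path)[/\Ainsert-(\d+)-/, 1].to_i }
end

# Typical use (assumes a local ./dump directory):
#   apply_order(Dir.glob('dump/*')).each { |path| puts path }
```

Sorting on the parsed integer rather than on the raw string keeps the order correct even if the idx width ever changes.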

data/docs/plans/2026-05-15-insert-000-schema-file.md ADDED

@@ -0,0 +1,151 @@
+# Plan: Add `insert-000-schema.{sql,js}` to the dump output
+## Context
+Currently, `exwiw` writes only `insert-NNN-{table}.sql` and `delete-NNN-{table}.sql` to its dump output (`dump/`); schema definitions such as `CREATE TABLE` are managed separately. As a result, anyone importing a dump into another environment has to provision the schema themselves: applying the dump to an empty DB fails, and it cannot be applied idempotently.
+Goal: ahead of the existing per-table insert files,
+- **SQL adapters** (`mysql2` / `postgresql` / `sqlite3`): write `CREATE TABLE IF NOT EXISTS ...` and related statements together into `insert-000-schema.sql`
+- **MongoDB adapter**: write `db.createCollection(...)` and `db.<col>.createIndex(...)` together into `insert-000-schema.js`
+With this, a dump can be applied in full even to an empty DB simply by applying `insert-000-schema.*` → `insert-001-...` → ... in order.
+## Design
+### Adopted approach (confirmed with the user)
+- DDL is obtained by **shelling out to external commands** (`mysqldump` / `pg_dump` / `sqlite3 .schema` for the SQL adapters). MongoDB has no equivalent CLI, so use the already-required `mongo` Ruby driver with `listCollections` / `listIndexes` (mongosh does not become an additional requirement).
+- Output is **consolidated into one file**: `insert-000-schema.{sql,js}`
+- Scope: **transcribe the full definitions in the DB as-is**. Do not narrow them by the config's `columns` / `fields`.
+- For Mongo, output **a .js file that can be run with mongosh**.
+### Architecture
+1. Add new methods to `Adapter::Base` (no-op by default):
+   - `dump_schema(ordered_tables, output_path, logger)`: writes the schema DDL to `output_path` (an absolute file path)
+   - `schema_output_extension`: `'sql'` by default; Mongo overrides it to `'js'`
+2. Call it exactly once in `Runner#run`, right after `mkdir_p` and before the per-table loop:
+   ```ruby
+   ordered_tables = ordered_table_names.map { |n| table_by_name.fetch(n) }
+   schema_path = File.join(@output_dir, "insert-000-schema.#{adapter.schema_output_extension}")
+   adapter.dump_schema(ordered_tables, schema_path, @logger)
+   ```
+   - `ordered_table_names` is sorted in dependency order (`DetermineTableProcessingOrder`), so DDL with FK constraints can be applied as-is.
+3. `dump_schema` implementation for each adapter:
+   **Sqlite3Adapter**: use the connection from the already-required `sqlite3` gem to run `SELECT type, sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY CASE type WHEN 'table' THEN 0 WHEN 'index' THEN 1 ELSE 2 END`. Automatic tables such as `sqlite_sequence` (whose `sql` is NULL) are excluded. Post-process each row's `sql`:
+   - `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `
+   - `CREATE INDEX ` → `CREATE INDEX IF NOT EXISTS ` (supported by SQLite)
+   - `CREATE UNIQUE INDEX ` → same as above
+   - Add IF NOT EXISTS to `CREATE TRIGGER ` / `CREATE VIEW ` in the same way
+   Design choice: the user chose "shelling out", but for SQLite it is safer to reuse the existing connection (no dependency on an external `sqlite3` binary being present). If shelling out should be used uniformly, go with option B.
+   **Mysql2Adapter**: shell out to `mysqldump`:
+   ```
+   MYSQL_PWD=$password mysqldump \
+     --host={host} --port={port} --user={user} \
+     --no-data --skip-add-drop-table --skip-comments \
+     --skip-extended-insert --skip-set-charset \
+     --compact \
+     {database}
+   ```
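+   - The password is passed via the `MYSQL_PWD` env variable rather than as a command-line argument (so it does not show up in the process list)
+   - Post-process stdout: `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `. MySQL does **not support** `CREATE INDEX IF NOT EXISTS`, but mysqldump includes indexes inline within `CREATE TABLE`, so this is normally not a problem
+   - Leave runtime SET statements such as `/*!40101 ...*/` as-is (mysqldump emits them for import compatibility)
+   **PostgresqlAdapter**: shell out to `pg_dump`: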
+   ```
+   PGPASSWORD=$password pg_dump \
+     --host={host} --port={port} --username={user} \
+     --schema-only --no-owner --no-acl --no-comments \
+     {database}
+   ```
+   - Post-process stdout: `CREATE TABLE ` → `CREATE TABLE IF NOT EXISTS `, `CREATE INDEX ` → `CREATE INDEX IF NOT EXISTS `, `CREATE UNIQUE INDEX ` → `CREATE UNIQUE INDEX ... IF NOT EXISTS` (PG supports both)
+   - `ALTER TABLE ... ADD CONSTRAINT` fails when applied twice. Either **wrap it in a `DO $$` block that checks for existence via `information_schema.table_constraints`**, or, more simply, detect `ALTER TABLE ONLY ... ADD CONSTRAINT` lines and **wrap them in a DO block** (an `IF NOT EXISTS` form is not an option, because `ALTER TABLE ADD CONSTRAINT` in PG 9.6+ has no IF NOT EXISTS). Implementation approach: wrap each `ALTER TABLE ... ADD CONSTRAINT "name" ...;` as follows:
+     ```sql
+     DO $$ BEGIN
+       ALTER TABLE ... ADD CONSTRAINT "name" ...;
+     EXCEPTION WHEN duplicate_object THEN NULL; END $$;
+     ```
+   - Leave `SET ...` / `SELECT pg_catalog.set_config(...)` lines as-is.
+   - `CREATE SCHEMA public` does not appear in plain pg_dump output, but if it is present, rewrite it to `CREATE SCHEMA IF NOT EXISTS`.
+   **MongodbAdapter**: use the already-required `mongo` driver and loop over `tables.reject(&:embedded?)`:
+   ```ruby
+   db.list_collections.each { |c| existing_collections << c['name'] }  # for skip-creation idempotency hint
+   ordered_tables.reject(&:embedded?).each do |config|
+     indexes = db[config.name].indexes.to_a.reject { |idx| idx['name'] == '_id_' }
+     # emit JS lines
+   end
+   ```
+   Sample output JS:
+   ```js
+   // Auto-generated by exwiw. Apply with: mongosh < insert-000-schema.js
+   try { db.createCollection("users"); } catch (e) { if (e.code !== 48) throw e; } // 48 = NamespaceExists
+   db.users.createIndex({"shop_id":1}, {"name":"index_users_on_shop_id"});
+   try { db.createCollection("shops"); } catch (e) { if (e.code !== 48) throw e; }
+   ```
+   - The `_id_` index is created automatically by MongoDB, so it is excluded
+   - `createIndex` is idempotent when the key spec and options match (re-creating is a no-op)
+   - The index doc contains driver-derived metadata such as `v` / `ns`, so apart from `key` and `name`, only `unique` / `sparse` / `partialFilterExpression` / `expireAfterSeconds` / `collation` are included in the output (allowlist)
+   - `MongodbAdapter#schema_output_extension` returns `'js'`
+   - Embedded collections (embedded_in) live inside the parent collection and need no indexes → skipped
+### Post-processing utility
+Create a new file `lib/exwiw/ddl_postprocessor.rb` and call it from each SQL adapter:
+- `.add_if_not_exists_to_create_table(sql)`: `CREATE TABLE ` at the start of a line → `CREATE TABLE IF NOT EXISTS ` (skipped when IF NOT EXISTS is already present)
+- `.add_if_not_exists_to_create_index(sql)`: the `CREATE [UNIQUE] INDEX ` family
+- `.wrap_add_constraint_in_do_block(sql)`: PG only
+### CLI behavior
+No CLI flag is added. `insert-000-schema.*` is always written (as the user chose). However, if the external command is not on PATH, stop with a clear error message:
+```
+Error: `pg_dump` not found in PATH. exwiw needs pg_dump to generate insert-000-schema.sql for the postgresql adapter.
+```
+## Files to modify / add
+| Path | Change |
+|---|---|
+| `lib/exwiw/adapter.rb` | Add `dump_schema(tables, path, logger)` and `schema_output_extension` to `Base` |
+| `lib/exwiw/runner.rb` | Call `adapter.dump_schema(...)` right after `mkdir_p` |
+| `lib/exwiw/adapter/sqlite3_adapter.rb` | `dump_schema` implementation (via `sqlite_master`, using the existing connection) |
+| `lib/exwiw/adapter/mysql2_adapter.rb` | `dump_schema` implementation (shells out to `mysqldump`) |
+| `lib/exwiw/adapter/postgresql_adapter.rb` | `dump_schema` implementation (shells out to `pg_dump`, wraps ADD CONSTRAINT in DO blocks) |
+| `lib/exwiw/adapter/mongodb_adapter.rb` | `dump_schema` implementation; override `schema_output_extension` |
+| `lib/exwiw/ddl_postprocessor.rb` (new) | `IF NOT EXISTS` rewriting / DO-block wrapping |
+| `lib/exwiw.rb` | `require` the new file |
+| `README.md` | Add `insert-000-schema.{sql,js}` to the `dump/` output listing; update the import procedure |
+| `spec/adapter/sqlite3_adapter_spec.rb` | `dump_schema` integration test (run against `scenario/initdb/init.sqlite3` and assert that the output contains `CREATE TABLE IF NOT EXISTS`) |
+| `spec/adapter/mongodb_adapter_spec.rb` | `dump_schema` test (stub the db to return `listIndexes` results and assert the output JS) |
+| `spec/runner_spec.rb` | Assert that `insert-000-schema.sql` is written to `output_dir` (confirming that it actually runs via Sqlite3) |
+## Existing code to reuse
+- `lib/exwiw/connection_config.rb` (host / port / user / password / database_name): used to assemble the shell-out arguments
+- `lib/exwiw/determine_table_processing_order.rb`: used as-is, since the schema dump is also ordered by dependency
+- `MongodbCollectionConfig#embedded?` (lib/exwiw/mongodb_collection_config.rb:33): skips embedded collections
+- `MongodbAdapter#db` (lib/exwiw/adapter/mongodb_adapter.rb:197): the existing lazy getter that returns the Mongo client
+## Verification
+### Unit / integration tests
+1. `bundle exec rspec spec/runner_spec.rb spec/adapter/sqlite3_adapter_spec.rb`: confirm that `insert-000-schema.sql` is generated via the sqlite3 path and that its contents include `CREATE TABLE IF NOT EXISTS "shops"`.
+2. `bundle exec rspec spec/adapter/mongodb_adapter_spec.rb`: stub the mongo client and confirm that the JS output contains `db.createCollection("users")` and the `createIndex(...)` calls for the corresponding collections.
+### E2E (via scenario scripts)
+3. Run `scenario/test_with_sqlite3.sh` and confirm that `dump/insert-000-schema.sql` is generated and that, against an empty DB, `sqlite3 empty.db < dump/insert-000-schema.sql && for f in dump/insert-*.sql; do sqlite3 empty.db < $f; done` succeeds.
+4. Likewise for `scenario/test_with_mysql2.sh` and `scenario/test_with_postgresql.sh`, confirm that `mysql empty_db < dump/insert-000-schema.sql` / `psql empty_db -f dump/insert-000-schema.sql` succeeds and that the insert files can then be applied. **If `mysqldump` / `pg_dump` has to be run inside the docker compose container (the DB container started by `compose.yml`), update the scenario scripts.**
+5. Run `scenario/test_with_mongodb.sh` and confirm that `dump/insert-000-schema.js` is written, that `mongosh "mongodb://localhost/empty_db" < dump/insert-000-schema.js` succeeds against an empty DB, and that each jsonl file can then be loaded with `mongoimport`.
+6. **Idempotency check**: applying the same schema file twice must not produce errors (that is, `IF NOT EXISTS` / `DO $$ EXCEPTION WHEN duplicate_object` / `try/catch on createCollection` are working).
+### Manual checkpoints
+- When run in an environment where `mysqldump` / `pg_dump` is not on PATH, it stops with an understandable error
+- When the `DATABASE_PASSWORD` env is absent, the external command must not fail with an authentication error (`MYSQL_PWD` / `PGPASSWORD` is passed when shelling out as well)
+- Add an entry to `CHANGELOG.md`
+## Notes / known risks
+- **MySQL's lack of `CREATE INDEX IF NOT EXISTS`**: mysqldump emits indexes inline in the table definition, so this is normally not an issue, but separate handling would be needed if the output is split using options such as `--no-create-info --no-data --routines --triggers`. Only the default behavior is supported for now.
+- **PG's COMMENT ON / GRANT / SEQUENCE OWNED BY**: can be reduced with `--no-owner --no-acl --no-comments`. The `CREATE SEQUENCE` for the sequences themselves is included in the `IF NOT EXISTS` rewrite targets.
+- **mongosh dependency**: mongosh is required to apply the output JS (stated in the README). exwiw itself generates the file using only the Ruby driver, so mongosh is not needed on the host that runs exwiw.
+- **Huge schemas**: pg_dump / mysqldump output is held in memory for post-processing, so memory usage grows for extremely large schemas. This is not expected to be a problem in practice.
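
The DO-block wrapping described in the plan can be illustrated with a small standalone sketch. The regex below is a simplified assumption written for this example (the shipped `ADD_CONSTRAINT_RE` in `ddl_postprocessor.rb` differs slightly), and `wrap_add_constraint` and `PG_DUMP_SAMPLE` are hypothetical names.

```ruby
# Simplified, illustrative pattern: an ALTER TABLE whose first action is
# ADD CONSTRAINT, up to the terminating semicolon.
ADD_CONSTRAINT = /^[ \t]*ALTER\s+TABLE\s+(?:ONLY\s+)?[^\s;,]+\s+ADD\s+CONSTRAINT\b[^;]*;/m

# Wrap each matching statement in a DO block that ignores duplicate_object,
# so re-applying the file does not fail on an existing constraint.
def wrap_add_constraint(sql)
  sql.gsub(ADD_CONSTRAINT) do |stmt|
    "DO $$ BEGIN\n  #{stmt.strip}\nEXCEPTION WHEN duplicate_object THEN NULL;\nEND $$;"
  end
end

# Hand-written sample shaped like pg_dump output (not captured from a real run).
PG_DUMP_SAMPLE = <<~SQL
  ALTER TABLE ONLY public.users
      ADD CONSTRAINT users_pkey PRIMARY KEY (id);
  ALTER TABLE public.users ALTER COLUMN id SET DEFAULT 1;
SQL

WRAPPED = wrap_add_constraint(PG_DUMP_SAMPLE)
puts WRAPPED
```

Only the first statement is wrapped; the `ALTER COLUMN` statement passes through untouched because its ALTER TABLE clause does not lead directly into ADD CONSTRAINT.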

data/docs/plans/2026-05-16-mongodb-from-clean-scenario.md ADDED

@@ -0,0 +1,76 @@
+# Plan: Verify MongoDB's `insert-000-schema.js` end-to-end in a scenario
+## Context
+`lib/exwiw/adapter/mongodb_adapter.rb#dump_schema` already writes
+`createCollection` / `createIndex` to `insert-000-schema.js`, but the scenarios had
+no path that applied this file, so it was not verified in CI either. The specific gaps:
+1. `scenario/setup_with_mongodb.rb` only loads the seed with `insert_many` and creates no indexes at all
+2. As a result, `tmp/mongodb/insert-000-schema.js` has only `createCollection` lines and zero `createIndex` lines
+3. `scenario/import_with_mongodb.rb` globs and processes only `insert-*.jsonl` and never executes `insert-000-schema.js`
+The "bring up from a clean DB" flow already introduced for sqlite3 / mysql2 / postgresql
+was not wired up to MongoDB's `insert-000-schema.js` (issue #16).
+## Goals
+- Add to CI a scenario that applies `mongosh insert-000-schema.js` → `insert-*.jsonl` in order against an empty target DB
+- Create representative indexes on the source DB so that `dump_schema` actually emits `createIndex` lines
+- Verify that the generated createIndex lines actually run under mongosh and that the indexes round-trip on the target side
+- Also pin the createIndex lines in the existing snapshot test (`spec/insert_output_snapshot_spec.rb`)
+## Changes
+### Scenario layer
+| Path | Change |
+|---|---|
+| `scenario/setup_with_mongodb.rb` | After loading the seed, create three representative indexes (unique `shops.name` / plain `users.email` / compound `orders.shop_id+user_id`) |
+| `scenario/import_with_mongodb.rb` | Add `--no-drop` and `--input-dir DIR` flags, because in the from-clean flow a drop would also remove the indexes created by schema.js |
+| `scenario/verify_with_mongodb.rb` | With `--with-indexes`, assert the target collections' indexes (skipped in the default scenario, where they are dropped at import time) |
+| `scenario/test_with_mongodb_from_clean.sh` (new) | `mongosh dropDatabase` → run exwiw → `mongosh insert-000-schema.js` → `import --no-drop --input-dir tmp/mongodb-clean` → `verify --with-indexes` |
+| `.github/workflows/scenario.yml` | Add a `mongodb-mongosh` install step and a `test_with_mongodb_from_clean.sh` run step to the with_mongodb job. The apt repo codename is pinned to `jammy` (on the assumption that ubuntu-latest moves up to noble) |
+### Snapshot test layer
+| Path | Change |
+|---|---|
+| `spec/support/bootstrap_databases.rb` | Create the same three indexes as the scenario during bootstrap |
+| `spec/insert_output_snapshots/mongodb/insert-000-schema.js` | Regenerated with three `db.getCollection(...).createIndex(...)` lines added |
+## Design decisions
+- **The unique index goes on `shops.name`, not `users.email`**: in the seed, `users.email`
+  has `user1@example.com` duplicated across 5 shops, so it cannot be unique. `shops.name`
+  ("Shop 1".."Shop 5") is unique in the seed, so it can be.
+- **Minimize side effects on existing scenarios**: the default behavior of `import_with_mongodb.rb`
+  is unchanged; `--no-drop` is opt-in. The existing `test_with_mongodb.sh` works unmodified.
+- **verify serves two purposes**: with the `--with-indexes` switch, only from-clean checks indexes.
+  The existing scenario loses its indexes through drop→insert, so index verification is skipped.
+- **Installing mongosh in CI**: the `mongo:7` service container has mongosh, but the
+  `mongosh` command on ubuntu-latest is a separate matter. Install it from MongoDB's apt repo
+  (`mongodb-mongosh` package). The codename is pinned to `jammy` (because there was a period
+  when the MongoDB 7.0 repo did not carry noble).
+- **Snapshot fixture updated to include indexes**: bootstrap_databases.rb and
+  setup_with_mongodb.rb create the same indexes, so the expectations of the snapshot test
+  and the scenario test do not diverge.
+## Verification
+- `bash scenario/test_with_mongodb.sh`: confirmed the existing scenario still passes ✓
+- `bash scenario/test_with_mongodb_from_clean.sh`: confirmed the new scenario passes
+  (indexes round-trip OK) ✓
+- `bundle exec rspec`: all 153 examples / 0 failures ✓
+- Visually inspected `tmp/mongodb-clean/insert-000-schema.js`:
+  ```js
+  db.getCollection("shops").createIndex({"name":1}, {"unique":true,"name":"idx_shops_name"});
+  db.getCollection("users").createIndex({"email":1}, {"name":"idx_users_email"});
+  db.getCollection("orders").createIndex({"shop_id":1,"user_id":1}, {"name":"idx_orders_shop_user"});
+  ```
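+## Notes
+- Flag parsing in `import_with_mongodb.rb` is a hand-written ARGV parse. If more arguments
+  are added, moving to OptionParser is worth considering (overkill at present, with only 2 flags).
+- Pinning `jammy` in the apt repo is expected not to break even if ubuntu-latest changes its
+  codename in the future, but when migrating to MongoDB 8.x the `7.0` in the repo URL must also be updated.
+- The scope of the existing issue #16 is MongoDB only. from_clean for the SQL adapters was already introduced in a separate PR.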

data/lib/exwiw/adapter/mongodb_adapter.rb CHANGED

@@ -101,6 +101,47 @@ module Exwiw
         'jsonl'
       end
+      def schema_output_extension
+        'js'
+      end
+      # Index options copied through to the emitted createIndex call. Anything
+      # else (`v`, `ns`, server-internal fields) is dropped — they would either
+      # be rejected by createIndex or are not portable across mongod versions.
+      INDEX_OPTION_ALLOWLIST = %w[
+        unique sparse hidden expireAfterSeconds collation
+        partialFilterExpression wildcardProjection
+      ].freeze
+      def dump_schema(ordered_tables, output_path)
+        require 'json'
+        collections = ordered_tables.reject(&:embedded?)
+        File.open(output_path, 'w') do |file|
+          file.puts("// Auto-generated by exwiw. Apply with: mongosh \"$MONGODB_URI\" #{File.basename(output_path)}")
+          file.puts
+          collections.each do |config|
+            name = config.name
+            file.puts(%(try { db.createCollection(#{JSON.generate(name)}); } catch (e) { if (e.code !== 48) throw e; }))
+          end
+          file.puts
+          collections.each do |config|
+            name = config.name
+            indexes = db[name].indexes.to_a.reject { |idx| idx['name'] == '_id_' }
+            indexes.each do |idx|
+              key = idx['key']
+              opts = idx.slice(*INDEX_OPTION_ALLOWLIST)
+              opts['name'] = idx['name'] if idx['name']
+              file.puts(%(db.getCollection(#{JSON.generate(name)}).createIndex(#{JSON.generate(key)}, #{JSON.generate(opts)});))
+            end
+          end
+        end
+        @logger.info("  Wrote schema for #{collections.size} collection(s) to #{output_path}.")
+      end
       def supports_bulk_delete?
         false
       end
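
To make the allowlist step in `dump_schema` concrete, the sketch below turns one index document into a `createIndex` line without touching a database. `create_index_line` and `SAMPLE_INDEX` are hypothetical names for this example, and the sample document is hand-written to resemble what a driver might return; it was not captured from a real server.

```ruby
require 'json'

# Same option names as the adapter's INDEX_OPTION_ALLOWLIST in the hunk above.
ALLOWLIST = %w[
  unique sparse hidden expireAfterSeconds collation
  partialFilterExpression wildcardProjection
].freeze

# Build one mongosh-runnable createIndex call from an index document,
# keeping only allowlisted options plus the index name.
def create_index_line(collection, index_doc)
  opts = index_doc.slice(*ALLOWLIST)
  opts['name'] = index_doc['name'] if index_doc['name']
  "db.getCollection(#{JSON.generate(collection)})" \
    ".createIndex(#{JSON.generate(index_doc['key'])}, #{JSON.generate(opts)});"
end

# Hand-written stand-in for an index document; `v` is metadata to be dropped.
SAMPLE_INDEX = { 'v' => 2, 'key' => { 'name' => 1 }, 'name' => 'idx_shops_name', 'unique' => true }.freeze

INDEX_LINE = create_index_line('shops', SAMPLE_INDEX)
puts INDEX_LINE
```

The `v` field is dropped because it is not on the allowlist, while `unique` and `name` survive.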

data/lib/exwiw/adapter/mysql2_adapter.rb CHANGED

@@ -14,6 +14,57 @@ module Exwiw
         connection.query(sql, cast: false, as: :array).to_a
       end
+      def dump_schema(ordered_tables, output_path)
+        require 'open3'
+        table_names = ordered_tables.map(&:name)
+        if table_names.empty?
+          File.write(output_path, "-- Auto-generated by exwiw. No tables in scope.\n")
+          return
+        end
+        cmd = [
+          'mysqldump',
+          "--host=#{@connection_config.host}",
+          "--port=#{@connection_config.port}",
+          "--user=#{@connection_config.user}",
+          '--no-data',
+          '--skip-add-drop-table',
+          # `--skip-comments` only suppresses the dump's header lines
+          # (e.g. `-- MySQL dump ...`, server version banner). Column and
+          # table `COMMENT '...'` clauses are emitted inline inside
+          # CREATE TABLE statements and are NOT affected, so this flag is
+          # purely about reducing noise in the generated file.
+          '--skip-comments',
+          '--skip-set-charset',
+          # Suppress `SET @@GLOBAL.GTID_PURGED=...` from the dump. It is intended
+          # for replication setup and breaks when the target already has GTIDs
+          # (ERROR 3546: added gtid set must not overlap with @@GLOBAL.GTID_EXECUTED).
+          '--set-gtid-purged=OFF',
+          '--compact',
+          @connection_config.database_name,
+          *table_names,
+        ]
+        env = { 'MYSQL_PWD' => @connection_config.password.to_s }
+        @logger.debug("  Running mysqldump for #{table_names.size} table(s)...")
+        stdout, stderr, status = Open3.capture3(env, *cmd)
+        unless status.success?
+          if stderr.include?('command not found') || stderr.empty?
+            raise "Failed to run `mysqldump`. Ensure the mysql client is installed and on PATH. stderr: #{stderr}"
+          end
+          raise "mysqldump failed (exit #{status.exitstatus}): #{stderr}"
+        end
+        idempotent = DdlPostprocessor.add_if_not_exists_to_create_table(stdout)
+        File.open(output_path, 'w') do |file|
+          file.puts("-- Auto-generated by exwiw via mysqldump. Idempotent CREATE TABLE statements for mysql.")
+          file.write(idempotent)
+        end
+        @logger.info("  Wrote schema for #{table_names.size} table(s) to #{output_path}.")
+      end
       def to_bulk_insert(results, table)
         table_name = table.name
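
The shell-out pattern used here (password passed through the environment, stdout captured for post-processing) can be sketched in isolation as follows. `run_client_tool` is a hypothetical helper, and the example runs the current Ruby interpreter instead of `mysqldump` so that it works without a database client installed. One detail worth knowing: when `Open3.capture3` is given an argument list and the executable cannot be found, it raises `Errno::ENOENT` rather than returning a failed status, so this sketch rescues that error explicitly.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: run an external client tool with extra environment
# variables and return its stdout, raising a readable error on failure.
def run_client_tool(env, *cmd)
  stdout, stderr, status = Open3.capture3(env, *cmd)
  raise "#{cmd.first} failed (exit #{status.exitstatus}): #{stderr}" unless status.success?

  stdout
rescue Errno::ENOENT
  # Raised by Open3 when the executable is not found on PATH.
  raise "`#{cmd.first}` not found in PATH"
end

# Stand-in for mysqldump: a child Ruby process that echoes the env variable,
# showing that the secret travels via the environment, not via argv.
CHILD_OUTPUT = run_client_tool(
  { 'EXWIW_SKETCH_SECRET' => 'secret' },
  RbConfig.ruby, '-e', 'print ENV["EXWIW_SKETCH_SECRET"]'
)
puts CHILD_OUTPUT
```

Passing the command as separate arguments avoids going through a shell, which is also why a missing binary surfaces as an exception instead of a "command not found" message on stderr.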

data/lib/exwiw/adapter/postgresql_adapter.rb CHANGED

@@ -14,6 +14,51 @@ module Exwiw
         connection.exec(sql).values
       end
+      def dump_schema(ordered_tables, output_path)
+        require 'open3'
+        table_names = ordered_tables.map(&:name)
+        if table_names.empty?
+          File.write(output_path, "-- Auto-generated by exwiw. No tables in scope.\n")
+          return
+        end
+        cmd = [
+          'pg_dump',
+          "--host=#{@connection_config.host}",
+          "--port=#{@connection_config.port}",
+          "--username=#{@connection_config.user}",
+          '--schema-only',
+          '--no-owner',
+          '--no-acl',
+          *table_names.flat_map { |t| ['--table', t] },
+          @connection_config.database_name,
+        ]
+        env = { 'PGPASSWORD' => @connection_config.password.to_s }
+        @logger.debug("  Running pg_dump for #{table_names.size} table(s)...")
+        stdout, stderr, status = Open3.capture3(env, *cmd)
+        unless status.success?
+          if stderr.include?('command not found') || stderr.empty?
+            raise "Failed to run `pg_dump`. Ensure the postgresql client is installed and on PATH. stderr: #{stderr}"
+          end
+          raise "pg_dump failed (exit #{status.exitstatus}): #{stderr}"
+        end
+        idempotent = stdout
+        idempotent = DdlPostprocessor.add_if_not_exists_to_create_schema(idempotent)
+        idempotent = DdlPostprocessor.add_if_not_exists_to_create_sequence(idempotent)
+        idempotent = DdlPostprocessor.add_if_not_exists_to_create_table(idempotent)
+        idempotent = DdlPostprocessor.add_if_not_exists_to_create_index(idempotent)
+        idempotent = DdlPostprocessor.wrap_add_constraint_in_do_block(idempotent)
+        File.open(output_path, 'w') do |file|
+          file.puts("-- Auto-generated by exwiw via pg_dump. Idempotent DDL for postgresql.")
+          file.write(idempotent)
+        end
+        @logger.info("  Wrote schema for #{table_names.size} table(s) to #{output_path}.")
+      end
       def to_bulk_insert(results, table)
         table_name = table.name
@@ -29,6 +74,36 @@ module Exwiw
         "INSERT INTO #{table_name} (#{column_names}) VALUES\n#{values};"
       end
+      # Transcribe the FROM-side sequence cursor backing `table.primary_key`
+      # onto the import target. Without this, importing into a clean DB leaves
+      # the sequence at 1 while the inserted rows occupy higher IDs, so the
+      # next default-PK INSERT collides. We query FROM's `last_value` /
+      # `is_called` directly (matching what pg_dump emits) rather than using
+      # MAX(pk), so a subsetted dump still preserves the source's "next id".
+      # Returns nil for non-auto-increment PKs (pg_get_serial_sequence -> NULL).
+      #
+      # Scope: ONLY the sequence attached to the primary key is synced. If a
+      # table has additional auto-increment columns (e.g. a non-PK SERIAL),
+      # those sequences are NOT transcribed and a subsequent default-value
+      # INSERT on them can collide. Rails-managed schemas don't hit this
+      # because only `id` is auto-increment, but bare PostgreSQL schemas may.
+      def post_insert_sql(table)
+        pk = table.primary_key
+        return nil if pk.nil? || pk.empty?
+        seq_name = connection
+          .exec_params("SELECT pg_get_serial_sequence($1, $2)", [table.name, pk])
+          .values.dig(0, 0)
+        return nil if seq_name.nil?
+        last_value, is_called = connection
+          .exec("SELECT last_value, is_called FROM #{seq_name}")
+          .values.first
+        is_called_sql = (is_called == 't' || is_called == true) ? 'true' : 'false'
+        "SELECT pg_catalog.setval('#{escape_single_quote(seq_name)}', #{last_value}, #{is_called_sql});"
+      end
       def to_bulk_delete(select_query_ast, table)
         raise NotImplementedError unless select_query_ast.is_a?(Exwiw::QueryAst::Select)
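
The statement that `post_insert_sql` appends can be previewed without a database by feeding the formatting step fixed values. `setval_sql` below is a hypothetical stand-in; in particular, doubling single quotes is only an assumption about what the adapter's `escape_single_quote` does, since its body is not shown in this diff.

```ruby
# Hypothetical formatter: build the setval statement from a sequence name
# and the last_value / is_called pair read from the source database.
def setval_sql(seq_name, last_value, is_called)
  # The pg gem may hand back 't'/'f' strings or real booleans depending on
  # type mapping, so accept both forms.
  called = (is_called == 't' || is_called == true) ? 'true' : 'false'
  # Assumption: quote escaping by doubling single quotes.
  quoted = seq_name.gsub("'", "''")
  "SELECT pg_catalog.setval('#{quoted}', #{Integer(last_value)}, #{called});"
end

SETVAL_LINE = setval_sql('public.users_id_seq', '42', 't')
puts SETVAL_LINE
```

Carrying `is_called` across matters for a sequence that has never been used: with `false`, the next `nextval` returns `last_value` itself instead of `last_value + 1`.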

data/lib/exwiw/adapter/sqlite3_adapter.rb CHANGED

@@ -14,6 +14,47 @@ module Exwiw
         connection.execute(sql)
       end
+      def dump_schema(ordered_tables, output_path)
+        @logger.debug("  Reading schema from sqlite_master...")
+        target_names = ordered_tables.map(&:name)
+        # `sqlite_master` row order preserves table creation order, which is also
+        # the dependency order produced by ActiveRecord-style migrations. To respect
+        # the caller-provided order, we partition tables / their owned indexes by
+        # ordered_tables.
+        all = connection.execute(<<~SQL)
+          SELECT type, name, tbl_name, sql FROM sqlite_master
+          WHERE sql IS NOT NULL AND name NOT LIKE 'sqlite_%'
+        SQL
+        tables_by_name   = all.select { |type, _, _, _| type == 'table'   }.to_h { |_, name, _, sql| [name, sql] }
+        indexes_by_owner = all.select { |type, _, _, _| type == 'index'   }.group_by { |_, _, tbl, _| tbl }
+        triggers_by_owner = all.select { |type, _, _, _| type == 'trigger' }.group_by { |_, _, tbl, _| tbl }
+        statements = []
+        target_names.each do |name|
+          table_sql = tables_by_name[name]
+          next unless table_sql
+          statements << finalize_stmt(DdlPostprocessor.add_if_not_exists_to_create_table(table_sql.strip))
+          (indexes_by_owner[name] || []).each do |_, _, _, idx_sql|
+            statements << finalize_stmt(DdlPostprocessor.add_if_not_exists_to_create_index(idx_sql.strip))
+          end
+          (triggers_by_owner[name] || []).each do |_, _, _, trg_sql|
+            statements << finalize_stmt(trg_sql.strip)
+          end
+        end
+        File.open(output_path, 'w') do |file|
+          file.puts("-- Auto-generated by exwiw. Idempotent CREATE statements for sqlite3.")
+          file.puts(statements.join("\n"))
+        end
+        @logger.info("  Wrote #{statements.size} schema statement(s) to #{output_path}.")
+      end
+      private def finalize_stmt(stmt)
+        stmt.end_with?(';') ? stmt : "#{stmt};"
+      end
       def to_bulk_insert(results, table)
         table_name = table.name
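
The reordering done by the SQLite adapter (emit each table in the caller's order, followed by the indexes it owns) can be tried on plain arrays shaped like `sqlite_master` rows. `ordered_statements` and `ROWS` are hypothetical names, the rows are hand-written samples, and triggers and the `IF NOT EXISTS` rewrite are left out to keep the sketch short.

```ruby
# Rows mimic `SELECT type, name, tbl_name, sql FROM sqlite_master`.
ROWS = [
  ['table', 'users', 'users', 'CREATE TABLE users (id integer, shop_id integer)'],
  ['index', 'idx_users_shop', 'users', 'CREATE INDEX idx_users_shop ON users (shop_id)'],
  ['table', 'shops', 'shops', 'CREATE TABLE shops (id integer)']
].freeze

# Emit table DDL in the order given by target_names, each table followed by
# its own indexes, and make sure every statement ends with a semicolon.
def ordered_statements(rows, target_names)
  tables  = rows.select { |type, *| type == 'table' }.to_h { |_, name, _, sql| [name, sql] }
  indexes = rows.select { |type, *| type == 'index' }.group_by { |_, _, tbl, _| tbl }

  target_names.flat_map do |name|
    next [] unless tables.key?(name) # skip names with no table in the source

    owned = (indexes[name] || []).map { |_, _, _, sql| sql }
    [tables[name], *owned].map { |stmt| stmt.end_with?(';') ? stmt : "#{stmt};" }
  end
end

STATEMENTS = ordered_statements(ROWS, %w[shops users])
puts STATEMENTS
```

Although `users` comes first in the source rows, `shops` is emitted first because the caller's dependency order wins.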

data/lib/exwiw/adapter.rb CHANGED

@@ -30,6 +30,22 @@ module Exwiw
         'sql'
       end
+      # File extension used for the leading `insert-000-schema.*` file.
+      # SQL adapters emit `.sql` (CREATE TABLE IF NOT EXISTS ...);
+      # MongodbAdapter overrides to `.js` (mongosh-runnable createCollection / createIndex).
+      def schema_output_extension
+        'sql'
+      end
+      # Write the leading schema-creation file for this adapter to `output_path`.
+      # Default is a no-op; subclasses override to emit idempotent DDL so the
+      # generated dump can be applied to an empty database.
+      #
+      # @param ordered_tables [Array] table configs in dependency order
+      # @param output_path [String] absolute path to write to
+      def dump_schema(ordered_tables, output_path)
+      end
       # Whether this adapter emits delete-NNN-*.sql files.
       def supports_bulk_delete?
         true
@@ -46,6 +62,14 @@ module Exwiw
       # dump_target. Default: nothing to validate.
       def validate_as_dump_target!(_config)
       end
+      # Optional SQL appended to the per-table insert-NNN-<table>.* file after
+      # the bulk INSERT statements. Use to bring side-state in sync with the
+      # explicit IDs that were just inserted (e.g. PostgreSQL sequences).
+      # Default: nil (nothing appended).
+      def post_insert_sql(_table)
+        nil
+      end
     end
     # @params [Exwiw::QueryAst] query_ast

data/lib/exwiw/ddl_postprocessor.rb ADDED

@@ -0,0 +1,61 @@
+# frozen_string_literal: true
+module Exwiw
+  # Rewrites raw CREATE statements emitted by mysqldump / pg_dump /
+  # sqlite_master.sql into idempotent forms so the generated
+  # `insert-000-schema.sql` file can be re-applied without error.
+  module DdlPostprocessor
+    module_function
+    # `CREATE TABLE [name]` → `CREATE TABLE IF NOT EXISTS [name]`.
+    # `TEMP` / `TEMPORARY` variants and already-IF-NOT-EXISTS lines are skipped.
+    def add_if_not_exists_to_create_table(sql)
+      sql.gsub(/\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
+        "#{m} IF NOT EXISTS"
+      end
+    end
+    # `CREATE [UNIQUE] INDEX [name]` → `CREATE [UNIQUE] INDEX IF NOT EXISTS [name]`.
+    # Use only for databases that support it (PostgreSQL, SQLite). MySQL does NOT
+    # support `CREATE INDEX IF NOT EXISTS` — do not call from the MySQL adapter.
+    def add_if_not_exists_to_create_index(sql)
+      sql.gsub(/\bCREATE(\s+UNIQUE)?\s+INDEX\b(?!\s+IF\s+NOT\s+EXISTS)/i) do
+        unique = Regexp.last_match(1) || ""
+        "CREATE#{unique} INDEX IF NOT EXISTS"
+      end
+    end
+    # `CREATE SCHEMA [name]` → `CREATE SCHEMA IF NOT EXISTS [name]`.
+    def add_if_not_exists_to_create_schema(sql)
+      sql.gsub(/\bCREATE\s+SCHEMA\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
+        "#{m} IF NOT EXISTS"
+      end
+    end
+    # `CREATE SEQUENCE [name]` → `CREATE SEQUENCE IF NOT EXISTS [name]`.
+    def add_if_not_exists_to_create_sequence(sql)
+      sql.gsub(/\bCREATE\s+SEQUENCE\b(?!\s+IF\s+NOT\s+EXISTS)/i) do |m|
+        "#{m} IF NOT EXISTS"
+      end
+    end
+    # `ALTER TABLE ... ADD CONSTRAINT ...;` is not idempotent on its own.
+    # PostgreSQL's ALTER TABLE has no IF NOT EXISTS clause for ADD CONSTRAINT, so wrap
+    # each statement in a DO block that swallows `duplicate_object`.
+    # Matches only statements whose ALTER TABLE clause leads directly into ADD CONSTRAINT
+    # (no intervening ALTER COLUMN / DROP / etc) so that unrelated ALTER TABLE statements
+    # in the same dump are not absorbed.
+    ADD_CONSTRAINT_RE = /^[ \t]*ALTER\s+TABLE\s+(?:ONLY\s+)?[^\s;,]+\s+(?:\n[ \t]*)?ADD\s+CONSTRAINT\b[^;]*;/m.freeze
+    def wrap_add_constraint_in_do_block(sql)
+      sql.gsub(ADD_CONSTRAINT_RE) do |stmt|
+        <<~SQL.chomp
+          DO $exwiw$ BEGIN
+            #{stmt.strip}
+          EXCEPTION WHEN duplicate_object THEN NULL;
+          END $exwiw$;
+        SQL
+      end
+    end
+  end
+end
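
To see what these rewrites do to a dump, here is a stand-alone sketch that copies two of the patterns from the file above and applies them to made-up pg_dump-style input. SketchDdl and make_idempotent are hypothetical names used only so the snippet runs without the gem; the sample SQL is invented for illustration.

```ruby
# frozen_string_literal: true

# Stand-alone sketch: both patterns are copied from the diff above.
# SketchDdl is a hypothetical name, not part of the gem's API.
module SketchDdl
  CREATE_TABLE_RE = /\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS)/i
  ADD_CONSTRAINT_RE =
    /^[ \t]*ALTER\s+TABLE\s+(?:ONLY\s+)?[^\s;,]+\s+(?:\n[ \t]*)?ADD\s+CONSTRAINT\b[^;]*;/m

  module_function

  # Applies the CREATE TABLE rewrite, then wraps ADD CONSTRAINT statements.
  def make_idempotent(sql)
    with_tables = sql.gsub(CREATE_TABLE_RE) { |m| "#{m} IF NOT EXISTS" }
    with_tables.gsub(ADD_CONSTRAINT_RE) do |stmt|
      "DO $exwiw$ BEGIN\n  #{stmt.strip}\n" \
        "EXCEPTION WHEN duplicate_object THEN NULL;\nEND $exwiw$;"
    end
  end
end

# Made-up input: one plain table, one TEMP table, one ADD CONSTRAINT,
# and one unrelated ALTER TABLE that must be left alone.
raw = <<~SQL
  CREATE TABLE public.posts (id bigint NOT NULL, user_id bigint);
  CREATE TEMP TABLE scratch (id bigint);
  ALTER TABLE ONLY public.posts
      ADD CONSTRAINT fk_posts_user FOREIGN KEY (user_id) REFERENCES public.users(id);
  ALTER TABLE public.posts ALTER COLUMN id SET DEFAULT 1;
SQL

once = SketchDdl.make_idempotent(raw)
puts once
```

The TEMP table and the ALTER COLUMN statement pass through untouched, because the first pattern requires TABLE directly after CREATE and the second requires ADD CONSTRAINT directly after the table name.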

data/lib/exwiw/runner.rb CHANGED

@@ -34,6 +34,11 @@ module Exwiw
         FileUtils.mkdir_p(@output_dir)
       end
+      ordered_tables = ordered_table_names.map { |n| table_by_name.fetch(n) }
+      schema_path = File.join(@output_dir, "insert-000-schema.#{adapter.schema_output_extension}")
+      @logger.info("Writing schema to #{schema_path}...")
+      adapter.dump_schema(ordered_tables, schema_path)
       total_size = ordered_table_names.size
       ordered_table_names.each_with_index do |table_name, idx|
         @logger.info("Processing table '#{table_name}'... (#{idx + 1}/#{total_size})")
@@ -57,6 +62,8 @@ module Exwiw
         insert_idx = (idx + 1).to_s.rjust(3, '0')
         File.open(File.join(@output_dir, "insert-#{insert_idx}-#{table_name}.#{adapter.output_extension}"), 'w') do |file|
           file.puts(insert_sql)
+          post = adapter.post_insert_sql(table)
+          file.puts(post) if post
         end
         if adapter.supports_bulk_delete?
@@ -81,14 +88,5 @@ module Exwiw
         klass.from(json)
       end
     end
-    private def build_adapter
-      case @connection_config["adapter"]
-      when "sqlite3"
-        Sqlite3Adapter.new(@connection_config)
-      else
-        raise "Unsupported adapter"
-      end
-    end
   end
 end
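
The runner change above writes the schema as insert-000 and numbers the per-table files from 001, so a lexicographic sort applies the schema first. A minimal sketch of that naming scheme follows; sketch_output_names is a hypothetical helper, the table names are placeholders, and it uses a single extension for brevity although the runner asks the adapter for schema_output_extension and output_extension separately.

```ruby
# frozen_string_literal: true

# Hypothetical helper reproducing the file-name pattern from the runner diff.
def sketch_output_names(ordered_table_names, extension)
  schema = "insert-000-schema.#{extension}"
  tables = ordered_table_names.each_with_index.map do |name, idx|
    # idx + 1 leaves slot 000 free for the schema file.
    "insert-#{(idx + 1).to_s.rjust(3, '0')}-#{name}.#{extension}"
  end
  [schema, *tables]
end

names = sketch_output_names(%w[users posts], "sql")
puts names
```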

data/lib/exwiw/version.rb CHANGED

@@ -1,5 +1,5 @@
 # frozen_string_literal: true
 module Exwiw
-  VERSION = "0.1.7"
+  VERSION = "0.1.8"
 end

data/lib/exwiw.rb CHANGED

@@ -11,6 +11,7 @@ require_relative "exwiw/table_config"
 require_relative "exwiw/embedded_in"
 require_relative "exwiw/mongodb_field"
 require_relative "exwiw/mongodb_collection_config"
+require_relative "exwiw/ddl_postprocessor"
 require_relative "exwiw/adapter"
 require_relative "exwiw/adapter/sqlite3_adapter"
 require_relative "exwiw/adapter/mysql2_adapter"

data/mise.toml ADDED

@@ -0,0 +1,6 @@
+[env]
+# Prepend scenario/bin so `pg_dump` resolves to the wrapper that delegates to
+# the postgres container (compose.yml). exwiw's PostgreSQL adapter shells out
+# to pg_dump, which requires a server/client major-version match — the dev DB
+# is postgres:17 while host clients are often older (e.g. Homebrew pg14).
+_.path = ["./scenario/bin"]

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: exwiw
 version: !ruby/object:Gem::Version
-  version: 0.1.7
+  version: 0.1.8
 platform: ruby
 authors:
 - Shia
@@ -35,6 +35,8 @@ files:
 - CHANGELOG.md
 - LICENSE.txt
 - README.md
+- docs/plans/2026-05-15-insert-000-schema-file.md
+- docs/plans/2026-05-16-mongodb-from-clean-scenario.md
 - exe/exwiw
 - lib/exwiw.rb
 - lib/exwiw/adapter.rb
@@ -44,6 +46,7 @@ files:
 - lib/exwiw/adapter/sqlite3_adapter.rb
 - lib/exwiw/belongs_to.rb
 - lib/exwiw/cli.rb
+- lib/exwiw/ddl_postprocessor.rb
 - lib/exwiw/determine_table_processing_order.rb
 - lib/exwiw/embedded_in.rb
 - lib/exwiw/mongo_query.rb
@@ -58,6 +61,7 @@ files:
 - lib/exwiw/table_config.rb
 - lib/exwiw/version.rb
 - lib/tasks/exwiw.rake
+- mise.toml
 homepage: https://github.com/riseshia/exwiw
 licenses:
 - MIT
@@ -79,7 +83,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.6.9
+rubygems_version: 4.0.10
 specification_version: 4
 summary: Ruby gem that allows you to export records from a database to a dump file.
 test_files: []