exwiw 0.4.10 → 0.5.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a89e3da8899badc5bcccc98f58878744b9b106b31ca38af6a70b41109002d126
4
- data.tar.gz: 13dd0757b0699c7865cd4c89b214288d16d017a9447c042e90c3da5cedc6f4b2
3
+ metadata.gz: 224bdc1d3b0f94e08463ad9e42a6e67d0592d902388b5873f5840226dbdbd3fe
4
+ data.tar.gz: de9ddd4a625565e0bcd28ff3f74df8da06092c443ad1f170d41c5858a24c4802
5
5
  SHA512:
6
- metadata.gz: b04b7e070c1215b24c6dfa6453e572bf560d77fbd9a6eb6172fbd301a8b662ebc1da815f13d5fb4ba31799a5e8e45c610c0271e810eff6c8c675e75fcf261e2d
7
- data.tar.gz: 4b88cfc7ed7a758b7a82b76d9e936c251b388825411fef9c305434919b7695f6866e5011828a6a2bf00c95692d157c1f0e4e768e0dc3e9ab79b980f025e4c626
6
+ metadata.gz: '08f564c07c09561a4b9b825bb7f6ca43a076df5b8262f165addad471639084fa5b5074330215edd36951ce0c427f510533f39d03e0632c1306ba4e9054391b33'
7
+ data.tar.gz: 2c161f236a676a15774fb097a7a2c4d66f95f38be8c465dfc29788b4c45165aa3b13dee2b1da771e317672e63f089d2496e10cf4fe2ebf16f97885e0e1c49c76
data/CHANGELOG.md CHANGED
@@ -2,6 +2,28 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.5.0] - 2026-06-16
6
+
7
+ ### Added
8
+
9
+ - A YAML **config file** (`exwiw.yml`) can now hold any option except the database connection settings, so they no longer have to be repeated on every invocation. Pass it with `--config=PATH`; when `--config` is omitted, `exwiw.yml` (or `exwiw.yaml`) is loaded automatically from the current directory if present. **Options passed on the CLI take precedence** over the file (the file only fills in options not given on the CLI). Connection settings — `host`, `port`, `user`, `database`, `uri`, `password` — are **rejected** in the file (they must come from the CLI/environment); `adapter` is the one connection-related key allowed. Relative paths in the file (`schema_dir`, `output_dir`, `after_insert_hook`) are resolved relative to the config file's own directory (so a root-level `exwiw.yml` with `schema_dir: exwiw/schema` reads naturally and an absolute `--config` works from any directory). Unknown keys are rejected to catch typos, and export-only keys (`output_dir`, `output_format`, `insert_only`, `after_insert_hook`) are ignored under `explain` so one file can be shared by both subcommands.
10
+
11
+ ### Changed
12
+
13
+ - **BREAKING**: the `export`/`explain` CLI option `--config-dir` has been renamed to `--schema-dir` to distinguish the directory of schema JSON files from the new `--config` config file. Its short form `-c` is now `--config` (the config file); `--schema-dir` has no short form. The hook contract is renamed to match: the shell-hook environment variable `EXWIW_CONFIG_DIR` is now `EXWIW_SCHEMA_DIR`, and the Ruby-hook `cli_options[:config_dir]` is now `cli_options[:schema_dir]`. Update invocations, scripts, and hooks accordingly (`--config-dir` no longer exists). `--schema-dir` is still required and has no default unless `schema_dir` is set in the config file.
14
+ - **BREAKING**: the env var that overrides where `schema:generate`, `schema:tidy`, and `schema:generate_mongoid` write their config has been renamed from `OUTPUT_DIR_PATH` to `EXWIW_SCHEMA_DIR_PATH`, and the default output directory is now `exwiw/schema` (previously `exwiw`). The new name disambiguates it from the dump-side `--output-dir`, and the dedicated `schema/` subdirectory leaves `exwiw/` free for other artifacts (hooks, dumps). `OUTPUT_DIR_PATH` is no longer read. Existing repositories should set `EXWIW_SCHEMA_DIR_PATH` (e.g. `EXWIW_SCHEMA_DIR_PATH=exwiw` to preserve the old flat layout) and/or move their config under `exwiw/schema/`; otherwise a `generate` run will write a fresh copy into `exwiw/schema/` and leave the old files stale. The `export`/`explain` CLI is unaffected, but examples now point at `exwiw/schema`.
15
+
16
+ ## [0.4.11] - 2026-06-15
17
+
18
+ ### Fixed
19
+
20
+ - MongoDB: `schema:generate_mongoid` now resolves an embedded collection's `embedded_in` document key by locating the parent's `embeds_one` / `embeds_many` that stores the collection, instead of trusting Mongoid's computed `assoc.inverse`. Mongoid returns a `nil` inverse for many valid embeddings (when no explicit `inverse_of:` is declared and it declines to infer one), and previously such a model was wrongly reported as having an "unresolvable inverse" and skipped (or aborted the run). Matching by the embedded collection also resolves an STI subclass embedded through a relation declared against its base class. Genuinely ambiguous embeddings (the same collection stored under several keys in the parent) are still reported as unrepresentable.
21
+
22
+ ### Added
23
+
24
+ - MongoDB: `schema:generate_mongoid` now **honors an explicit `ignore: true` on disk** and skips re-introspecting it, so a construct exwiw cannot represent that you have already triaged no longer aborts the (fail-loud, default) run — and the annotation survives regeneration. Works at two granularities: a whole **collection** marked `ignore: true` is preserved as-is without introspection, and a single **`belongs_to`** marked `ignore: true` (no `table_name` required) is preserved while the rest of its collection still generates and dumps (its foreign-key column stays an ordinary field). This lets you keep the generator strict by default while opting individual stale/unrepresentable constructs out by hand, rather than relying on `EXWIW_SKIP_UNSUPPORTED`.
25
+ - `MongodbCollectionConfig` and `BelongsTo` gain an optional, user-owned **`ignore_type`** tag (free-form; exwiw never interprets or emits it) to record *why* something is ignored — e.g. `"need_code_fix"` for an application-side bug, `"unsupported"` for a shape exwiw cannot express. Preserved across regeneration like `comment`. `BelongsTo#table_name` is now optional so an ignored, no-longer-resolvable relation can be recorded without a target collection (a non-ignored `belongs_to` still requires one).
26
+
5
27
  ## [0.4.10] - 2026-06-12
6
28
 
7
29
  ### Fixed
data/README.md CHANGED
@@ -72,7 +72,7 @@ exwiw \
72
72
  --port=3306 \
73
73
  --user=reader \
74
74
  --database=app_production \
75
- --config-dir=exwiw \
75
+ --schema-dir=exwiw/schema \
76
76
  --target-table=shops \
77
77
  --ids=1 \ # comma separated ids
78
78
  --output-dir=dump \
@@ -81,7 +81,7 @@ exwiw \
81
81
 
82
82
  By default `--ids` are matched against the target table's primary key. `--ids-column=COLUMN` matches them against a different column instead (e.g. `--target-table=users --ids=alice@example.com --ids-column=email`). Related tables are still extracted correctly: their foreign keys are resolved through the target via a subquery (`WHERE fk IN (SELECT pk FROM target WHERE COLUMN IN (...))`), so only the target table's filter column changes. This is the SQL-adapter counterpart of the mongodb `--ids-field`; the two are mutually exclusive and each is rejected by the other adapter family. Note: if `COLUMN` is itself masked, re-running `delete-*` against an already-imported (masked) dump won't match, so prefer a stable natural key.
83
83
 
84
- When `--target-table` and `--ids` are omitted, exwiw dumps all tables defined in `--config-dir`:
84
+ When `--target-table` and `--ids` are omitted, exwiw dumps all tables defined in `--schema-dir`:
85
85
 
86
86
  ```bash
87
87
  # dump all tables
@@ -91,7 +91,7 @@ exwiw \
91
91
  --port=5432 \
92
92
  --user=reader \
93
93
  --database=app_production \
94
- --config-dir=exwiw \
94
+ --schema-dir=exwiw/schema \
95
95
  --output-dir=dump
96
96
  ```
97
97
 
@@ -123,25 +123,58 @@ exwiw explain \
123
123
  --adapter=postgresql \
124
124
  --host=localhost --port=5432 --user=reader \
125
125
  --database=app_production \
126
- --config-dir=exwiw \
126
+ --schema-dir=exwiw/schema \
127
127
  --target-table=shops --ids=1
128
128
  ```
129
129
 
130
130
  The `--output-dir`, `--output-format`, `--insert-only`, and `--after-insert-hook` options are dump-specific and rejected when used with `explain`.
131
131
 
132
+ ### Config file (`exwiw.yml`)
133
+
134
+ Options you would otherwise repeat on every run can be kept in a YAML config file. Pass it with `--config=PATH`; when `--config` is omitted, exwiw automatically loads `exwiw.yml` (or `exwiw.yaml`) from the current directory if present.
135
+
136
+ **Options passed on the CLI always take precedence over the config file** — the config only fills in options you did not pass. This lets you commit the stable settings (which schema to read, output format, ...) while still varying the environment-specific connection details per invocation.
137
+
138
+ ```yaml
139
+ # exwiw.yml — keep at the project root, alongside exwiw/schema/
140
+ adapter: postgresql
141
+ schema_dir: exwiw/schema
142
+ output_dir: dump
143
+ output_format: insert # insert | copy
144
+ insert_only: false
145
+ after_insert_hook: hooks/seed.rb
146
+ log_level: info # debug | info
147
+ # target_table / ids / ids_field / ids_column may also be set here
148
+ ```
149
+
150
+ With the file above, only the connection details need to be supplied on the CLI:
151
+
152
+ ```bash
153
+ DATABASE_PASSWORD=... exwiw \
154
+ --host=localhost --port=5432 --user=reader --database=app_production \
155
+ --target-table=shops --ids=1
156
+ ```
157
+
158
+ Notes:
159
+
160
+ - **Database connection settings stay on the CLI/environment.** `host`, `port`, `user`, `database`, `uri`, and `password` are **rejected** in the config file (exwiw exits with an error). `adapter` is the one connection-related key that *is* allowed in the file.
161
+ - **Relative paths in the config (`schema_dir`, `output_dir`, `after_insert_hook`) are resolved relative to the config file's own directory**, not the current working directory. So with the config at the project root, `schema_dir: exwiw/schema` reads naturally, and an absolute `--config=/path/to/exwiw.yml` works no matter where you run from. (CLI path flags remain relative to the current directory — each source resolves relative to where it is written.) Absolute paths are used as-is.
162
+ - Unknown keys are rejected so a typo surfaces immediately.
163
+ - Export-only keys (`output_dir`, `output_format`, `insert_only`, `after_insert_hook`) are ignored when running `explain`, so a single config file can be shared by both subcommands.
164
+
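The precedence rule described above can be illustrated with a small standalone sketch. It is not exwiw's implementation: `merge_options` and the option hashes are hypothetical names used only for illustration, and the sketch assumes that a CLI option which was not passed is represented as `nil`.

```ruby
require 'yaml'

# Keys that the README describes as export-only.
EXPORT_ONLY_KEYS = %w[output_dir output_format insert_only after_insert_hook].freeze

# Hypothetical helper (not part of exwiw): the config file only fills in
# options that were not given on the CLI, so CLI values always win.
def merge_options(cli, config, subcommand:)
  # Under `explain`, export-only keys from the file are ignored.
  config = config.reject { |key, _| EXPORT_ONLY_KEYS.include?(key) } if subcommand == 'explain'
  # A nil CLI value means "not passed"; `false` is a real value and is kept.
  config.merge(cli.compact)
end

config = YAML.safe_load(<<~YAML)
  adapter: postgresql
  output_format: insert
  insert_only: false
YAML

# `export` with --output-format=copy on the CLI: the CLI value overrides the file.
export_options = merge_options({ 'output_format' => 'copy', 'adapter' => nil }, config, subcommand: 'export')

# `explain` sharing the same file: export-only keys are dropped before merging.
explain_options = merge_options({ 'adapter' => nil }, config, subcommand: 'explain')
```

Treating `nil` as "not passed" while keeping `false` matters for boolean options such as `insert_only`, where an explicit `false` should not be overwritten by the file.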
132
165
  ### Generator
133
166
 
134
167
  The config generator is provided as a Rake task.
135
168
 
136
169
  ```bash
137
- # generate table schema under exwiw/
170
+ # generate table schema under exwiw/schema/
138
171
  bundle exec rake exwiw:schema:generate
139
172
  ```
140
173
 
141
- By default, the schema files will be saved in the `exwiw` directory. You can specify a different output directory by setting the `OUTPUT_DIR_PATH` environment variable:
174
+ By default, the schema files will be saved in the `exwiw/schema` directory. You can specify a different output directory by setting the `EXWIW_SCHEMA_DIR_PATH` environment variable:
142
175
 
143
176
  ```sh
144
- OUTPUT_DIR_PATH=custom_directory bundle exec rake exwiw:schema:generate
177
+ EXWIW_SCHEMA_DIR_PATH=custom_directory bundle exec rake exwiw:schema:generate
145
178
  ```
146
179
 
147
180
  #### Tidying stale config (`schema:tidy`)
@@ -159,14 +192,14 @@ bundle exec rake exwiw:schema:tidy
159
192
 
160
193
  Because it reads the database directly, a table that still exists in the database but has lost (or never had) an ActiveRecord model is **kept** — only a table that is genuinely gone is removed. (This is the deliberate counterpart to `generate`, which is model-driven and only ever adds what the models know about.)
161
194
 
162
- It respects `OUTPUT_DIR_PATH` and the per-database subdirectory layout in the same way as `schema:generate`. Unlike `generate`, `tidy` never adds or regenerates entries — every surviving table/column (including hand-edited `comment` / `ignore` / `replace_with`) is left untouched, so it is safe to run on a customized config. The task prints which tables and columns it removed (or that the config was already tidy). Stale `belongs_tos` are not pruned by `tidy`; rerun `schema:generate` to refresh those.
195
+ It respects `EXWIW_SCHEMA_DIR_PATH` and the per-database subdirectory layout in the same way as `schema:generate`. Unlike `generate`, `tidy` never adds or regenerates entries — every surviving table/column (including hand-edited `comment` / `ignore` / `replace_with`) is left untouched, so it is safe to run on a customized config. The task prints which tables and columns it removed (or that the config was already tidy). Stale `belongs_tos` are not pruned by `tidy`; rerun `schema:generate` to refresh those.
163
196
 
164
197
  #### Multiple databases
165
198
 
166
199
  If the application uses Rails' multiple-database support (`connects_to`), `schema:generate` buckets models by the database they connect to and writes each database's config files into its own subdirectory of the output directory, named after the database config name (`primary`, `analytics`, ...):
167
200
 
168
201
  ```
169
- exwiw/
202
+ exwiw/schema/
170
203
  primary/
171
204
  shops.json
172
205
  users.json
@@ -194,20 +227,53 @@ It is a distinct task and class (`Exwiw::MongoidSchemaGenerator`) from the Activ
194
227
  - the collection name and the `_id` primary key,
195
228
  - `fields` from the declared Mongoid fields (referenced `belongs_to` foreign keys such as `shop_id`, and the `created_at` / `updated_at` columns added by `Mongoid::Timestamps`, are ordinary fields — their BSON `ObjectId` / `Date` values serialize as MongoDB Extended JSON at dump time). For an aliased field (`field :ctry, as: :country`), the generator emits the **stored** document key (`ctry`), never the Ruby accessor (`country`), so masking and projection target the key that actually appears in the document, and additionally records the accessor as `mongoid_field_name` on that field so the short key stays understandable (association aliases such as `shop => shop_id` and the built-in `id => _id` are not field renames and are not annotated),
196
229
  - `belongs_tos` from referenced `belongs_to` associations (`{ table_name, foreign_key }`). A referenced `belongs_to` declared on an *embedded* document is dropped (cross-collection refs from inside embedded subdocuments are unsupported — see [MongoDB notes](#mongodb-notes)), but its foreign-key column is still kept as an ordinary field. A `has_and_belongs_to_many` association is also dropped (its foreign keys are stored as an array field, e.g. `tag_ids`, which exwiw cannot follow as a single-valued foreign key), while that `*_ids` array column is kept as an ordinary field,
197
- - `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations. Each embedded config names its *immediate* parent collection and the document key it lives under (`store_as`, defaulting to the relation name); nested embedding is represented as a chain (`comments` → `embedded_in` `posts`, `posts` → `embedded_in` `users`) rather than a flattened dot-path, matching how the adapter recurses through array and Hash subdocuments. A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and so cannot be expressed as an `embedded_in` config; the generator raises a clear error pointing you to define that collection's config by hand. A *self-referential / cyclic* embedding (Mongoid's `recursively_embeds_many` / `recursively_embeds_one`) makes a collection both a top-level document and embedded inside documents of its own type; exwiw represents a collection as either top-level or embedded, not both, so the generator likewise raises a clear error rather than emit a config that would silently make the collection undumpable.
230
+ - `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations. Each embedded config names its *immediate* parent collection and the document key it lives under (`store_as`, defaulting to the relation name); nested embedding is represented as a chain (`comments` → `embedded_in` `posts`, `posts` → `embedded_in` `users`) rather than a flattened dot-path, matching how the adapter recurses through array and Hash subdocuments. The document key is resolved by locating the parent's `embeds_one` / `embeds_many` that stores this collection. (Mongoid's computed inverse is frequently `nil` when no explicit `inverse_of:` is set, so exwiw matches by the collection the parent's embedding relations store rather than trusting that inverse — this also resolves an STI subclass embedded through a relation declared against its base class.) When the same collection is embedded under several keys in the parent, the path is ambiguous and treated as unrepresentable (see below). A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and so cannot be expressed as an `embedded_in` config. A *self-referential / cyclic* embedding (Mongoid's `recursively_embeds_many` / `recursively_embeds_one`) makes a collection both a top-level document and embedded inside documents of its own type; exwiw represents a collection as either top-level or embedded, not both, so it cannot emit an `embedded_in` config that would silently make the collection undumpable. These unrepresentable shapes are handled best-effort by default and abort only in strict mode (see below).
198
231
 
199
232
  Models in an inheritance hierarchy whose subclasses share the base's collection (Mongoid STI, distinguished by the auto-added `_type` discriminator) collapse into a single config: the generator discovers the subclasses via `descendants` (Mongoid registers only the base class in `Mongoid.models`) and unions every class's `fields` and `belongs_tos` into the collection config, so subclass-only fields and associations are not lost.
200
233
 
201
234
  Regeneration preserves hand-edited `replace_with`, `filter`, `ignore`, and `bulk_insert_chunk_size` values, like the ActiveRecord generator. Indexes are not written to the config — they are introspected from the live database at dump time (see [MongoDB notes](#mongodb-notes)). Polymorphic `belongs_to` is not yet expanded by this task.
202
235
 
203
- By default the task **aborts** when a model uses a construct exwiw cannot represent: a `belongs_to` whose target class can no longer be resolved (a stale relation left behind after its model was removed), or a polymorphic / self-referential-cyclic / unresolvable-parent `embedded_in` (see the cases above). Set `EXWIW_SKIP_UNSUPPORTED=1` to keep going instead:
236
+ By default the task **aborts** when a model uses a construct exwiw cannot represent: a `belongs_to` whose target class can no longer be resolved (a stale relation left behind after its model was removed), or a polymorphic / self-referential-cyclic / ambiguous / unresolvable-parent `embedded_in` (see the cases above).
237
+
238
+ #### Honoring an explicit `ignore` (the recommended way to keep these out)
239
+
240
+ When you have reviewed such a construct and decided exwiw should leave it alone, mark it `ignore: true` in its config on disk. The generator **honors an explicit `ignore` and skips re-introspecting it**, so it never aborts the run on something you have already triaged — and your annotation survives regeneration. Two granularities:
241
+
242
+ - A whole **collection** exwiw cannot represent (e.g. a polymorphic / ambiguous `embedded_in`) — mark the collection config `"ignore": true`. To actually dump/mask it later, define its `embedded_in` config by hand (see [Embedded documents](#embedded-documents)).
243
+ - A single **`belongs_to`** that no longer resolves while the rest of its collection is fine (e.g. a stale relation pointing at a removed model) — mark that entry `"ignore": true`, with no `table_name`. The relation is dropped from extraction (`#reject_ignored_members!`) while its foreign-key column stays an ordinary field, and the collection keeps dumping.
244
+
245
+ Record *why* with the optional **`ignore_type`** (a free-form tag exwiw never interprets — e.g. `"need_code_fix"` for an application-side bug, `"unsupported"` for a shape exwiw cannot express) and a **`comment`**. Both are user-owned and preserved across regeneration; the generator never emits `ignore_type` itself.
246
+
247
+ ```json
248
+ // orders.json — a stale belongs_to flagged for a code fix; the collection still dumps
249
+ {
250
+ "name": "orders",
251
+ "primary_key": "_id",
252
+ "belongs_to": [
253
+ { "table_name": "shops", "foreign_key": "shop_id" },
254
+ {
255
+ "foreign_key": "coupon_id",
256
+ "ignore": true,
257
+ "ignore_type": "need_code_fix",
258
+ "comment": "FIXME: belongs_to :coupon -> Coupon does not exist (dead relation)."
259
+ }
260
+ ],
261
+ "fields": [ /* ... coupon_id is kept as an ordinary field ... */ ]
262
+ }
263
+ ```
264
+
265
+ #### First bootstrap pass: `EXWIW_SKIP_UNSUPPORTED=1`
266
+
267
+ For the very first pass against a large app — before any `ignore` annotations exist — set `EXWIW_SKIP_UNSUPPORTED=1` to keep going past *un-annotated* unrepresentable constructs instead of aborting one at a time:
204
268
 
205
269
  ```bash
206
270
  EXWIW_SKIP_UNSUPPORTED=1 bundle exec rake exwiw:schema:generate_mongoid
207
271
  ```
208
272
 
209
273
  - An unresolvable `belongs_to` is dropped from the collection's `belongs_tos` (its foreign-key column is still kept as an ordinary field, like the polymorphic / HABTM cases) and a warning naming the relation is printed to stderr.
210
- - An unrepresentable `embedded_in` collection is emitted as a **top-level** config marked `"ignore": true` with a `comment` recording why, and a warning is printed. `ignore: true` keeps it out of extraction so it is not wrongly dumped as its own collection; to actually dump/mask such an embedded collection, define its `embedded_in` config by hand (see [Embedded documents](#embedded-documents)). This is useful for bootstrapping a config against a large app where a handful of legacy/polymorphic collections would otherwise block the whole run.
274
+ - An unrepresentable `embedded_in` collection is emitted as a **top-level** config marked `"ignore": true` with a `comment` recording why, and a warning is printed.
275
+
276
+ Review the stderr warnings, annotate the affected configs (`ignore` / `ignore_type` / `comment`), and subsequent runs complete without the flag because the generator honors those explicit ignores.
211
277
 
212
278
  ### Configuration
213
279
 
@@ -234,7 +300,7 @@ This is an example of the one table schema:
234
300
  }
235
301
  ```
236
302
 
237
- `--config-dir` will use all json files in the specified directory.
303
+ `--schema-dir` will use all json files in the specified directory.
238
304
 
239
305
  ### Output format
240
306
 
@@ -274,7 +340,7 @@ SQL
274
340
 
275
341
  **Shell hook**: anything other than `.rb` is exec'd as a child process. It is a pure side-effect hook — exwiw does not capture its stdout. The hook receives these env vars and inherits `DATABASE_PASSWORD` from the parent:
276
342
 
277
- - `EXWIW_OUTPUT_DIR`, `EXWIW_CONFIG_DIR`
343
+ - `EXWIW_OUTPUT_DIR`, `EXWIW_SCHEMA_DIR`
278
344
  - `EXWIW_DATABASE_ADAPTER`, `EXWIW_DATABASE_HOST`, `EXWIW_DATABASE_PORT`, `EXWIW_DATABASE_USER`, `EXWIW_DATABASE_NAME`
279
345
  - `EXWIW_TARGET_TABLE`, `EXWIW_IDS` (comma-separated), `EXWIW_OUTPUT_FORMAT`
280
346
 
@@ -31,7 +31,7 @@ module Exwiw
31
31
  def self.run_shell(path:, cli_options:, output_dir:, logger:)
32
32
  env = {
33
33
  'EXWIW_OUTPUT_DIR' => output_dir,
34
- 'EXWIW_CONFIG_DIR' => cli_options[:config_dir].to_s,
34
+ 'EXWIW_SCHEMA_DIR' => cli_options[:schema_dir].to_s,
35
35
  'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
36
36
  'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
37
37
  'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
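For a hook script that has to run under both 0.4.x and 0.5.0 during a migration, one option is to read the new variable and fall back to the old one. The following is a sketch under that assumption, not part of exwiw; `schema_dir_from_env` is a hypothetical helper. Because a `.rb` hook is loaded as a Ruby hook rather than exec'd, a script like this would need a non-`.rb` name (and presumably a shebang line) to receive these environment variables.

```ruby
# Hypothetical helper for a hook script (not part of exwiw): prefer the
# 0.5.0 variable name and fall back to the pre-0.5.0 one.
def schema_dir_from_env(env = ENV)
  %w[EXWIW_SCHEMA_DIR EXWIW_CONFIG_DIR]
    .map { |name| env[name] }
    # run_shell passes `cli_options[...].to_s`, so a missing value can
    # arrive as an empty string; treat that the same as unset.
    .find { |value| value && !value.empty? }
end
```

Passing a plain `Hash` instead of `ENV` keeps the helper easy to exercise in isolation, for example `schema_dir_from_env('EXWIW_CONFIG_DIR' => 'exwiw')`.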
@@ -5,7 +5,11 @@ module Exwiw
5
5
  include Serdes
6
6
 
7
7
  attribute :foreign_key, String
8
- attribute :table_name, String
8
+ # Optional so an ignored, no-longer-resolvable relation (a stale
9
+ # `belongs_to` whose target class is gone) can be recorded with no target
10
+ # collection. A non-ignored belongs_to still requires it — enforced by the
11
+ # owning config's validation (e.g. MongodbCollectionConfig#validate_belongs_tos!).
12
+ attribute :table_name, optional(String), skip_serializing_if_nil: true
9
13
  # Set only for a polymorphic association. `foreign_type` is the name of the
10
14
  # column storing the type (e.g. `reviewable_type`), and `type_value` is the
11
15
  # value held in that column (e.g. `"Product"`). Both are nil for a
@@ -32,6 +36,11 @@ module Exwiw
32
36
  # extraction once the config is loaded (see #reject_ignored_members!).
33
37
  attribute :comment, optional(String), skip_serializing_if_nil: true
34
38
  attribute :ignore, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
39
+ # Free-form tag recording *why* this relation is ignored (e.g.
40
+ # "need_code_fix" for an application-side bug, "unsupported" for a shape
41
+ # exwiw cannot express). exwiw never interprets or emits it; purely
42
+ # informational and preserved across regeneration like `comment`.
43
+ attribute :ignore_type, optional(String), skip_serializing_if_nil: true
35
44
 
36
45
  def self.from_symbol_keys(hash)
37
46
  from(hash.transform_keys(&:to_s))
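The rule stated in the comment above (a `table_name` may be omitted only on an ignored relation) can be sketched as follows. The body of `MongodbCollectionConfig#validate_belongs_tos!` is not part of this diff, so this is an assumption about its intent, not a copy of it; `BelongsToSketch` is a hypothetical stand-in for the real Serdes-backed class.

```ruby
# Hypothetical stand-in for the Serdes-backed BelongsTo class, for illustration only.
BelongsToSketch = Struct.new(:foreign_key, :table_name, :ignore, :ignore_type, keyword_init: true)

# Assumed rule: table_name is required unless the relation is explicitly ignored.
def validate_belongs_tos!(belongs_tos)
  belongs_tos.each do |relation|
    next if relation.ignore
    next unless relation.table_name.nil?

    raise ArgumentError, "belongs_to '#{relation.foreign_key}' requires table_name unless ignore is true"
  end
end

valid = [
  BelongsToSketch.new(foreign_key: 'shop_id', table_name: 'shops'),
  # A stale relation recorded without a target collection.
  BelongsToSketch.new(foreign_key: 'coupon_id', ignore: true, ignore_type: 'need_code_fix')
]
validate_belongs_tos!(valid)
```

`ignore_type` is carried along but never inspected, which mirrors the statement that exwiw does not interpret it.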
data/lib/exwiw/cli.rb CHANGED
@@ -5,6 +5,7 @@ require 'optparse'
5
5
  require 'pathname'
6
6
 
7
7
  require 'json'
8
+ require 'yaml'
8
9
 
9
10
  require 'exwiw'
10
11
 
@@ -12,6 +13,39 @@ module Exwiw
12
13
  class CLI
13
14
  KNOWN_SUBCOMMANDS = %w[export explain].freeze
14
15
 
16
+ # Config file loaded automatically when --config is omitted, if one exists in
17
+ # the current directory. Kept at the project root (rather than under exwiw/)
18
+ # so that config-relative paths like `schema_dir: exwiw/schema` read naturally.
19
+ # Both extensions are accepted; .yml wins when both are present.
20
+ DEFAULT_CONFIG_PATHS = %w[exwiw.yml exwiw.yaml].freeze
21
+
22
+ # Keys accepted in the config file. Anything outside this set is rejected so
23
+ # a typo surfaces immediately instead of being silently ignored. These mirror
24
+ # the non-connection CLI options (plus `adapter`).
25
+ ALLOWED_CONFIG_KEYS = %w[
26
+ adapter
27
+ schema_dir
28
+ output_dir
29
+ output_format
30
+ insert_only
31
+ after_insert_hook
32
+ log_level
33
+ target_table
34
+ target_collection
35
+ ids
36
+ ids_field
37
+ ids_column
38
+ ].freeze
39
+
40
+ # Database connection settings are environment-specific (and sometimes
41
+ # secret-adjacent), so they must be passed via CLI/env, never the committed
42
+ # config file. `adapter` is the one connection-ish key allowed in config.
43
+ REJECTED_CONNECTION_KEYS = %w[host port user database uri password].freeze
44
+
45
+ # Keys that only make sense for `export`. They are skipped when merging config
46
+ # for `explain` so a shared config file does not trip validate_explain_only!.
47
+ EXPORT_ONLY_CONFIG_KEYS = %w[output_dir output_format insert_only after_insert_hook].freeze
48
+
15
49
  def self.start(argv)
16
50
  new(argv).run
17
51
  end
@@ -34,7 +68,8 @@ module Exwiw
34
68
  @database_password = ENV["DATABASE_PASSWORD"]
35
69
  @connection_uri = nil
36
70
  @output_dir = nil
37
- @config_dir = nil
71
+ @schema_dir = nil
72
+ @config_file_path = nil
38
73
  @database_adapter = nil
39
74
  @database_name = nil
40
75
  @target_table_name = nil
@@ -45,7 +80,9 @@ module Exwiw
45
80
  @output_format = nil
46
81
  @insert_only = nil
47
82
  @after_insert_hook_path = nil
48
- @log_level = :info
83
+ # nil (not :info) so we can tell "user passed --log-level" from the default,
84
+ # letting a config-file value fill in; the :info default is applied later.
85
+ @log_level = nil
49
86
 
50
87
  parser.parse!(@argv)
51
88
  end
@@ -82,7 +119,7 @@ module Exwiw
82
119
  Runner.new(
83
120
  connection_config: connection_config,
84
121
  output_dir: @output_dir,
85
- config_dir: @config_dir,
122
+ schema_dir: @schema_dir,
86
123
  dump_target: dump_target,
87
124
  output_format: @output_format,
88
125
  insert_only: @insert_only,
@@ -93,7 +130,7 @@ module Exwiw
93
130
  when "explain"
94
131
  ExplainRunner.new(
95
132
  connection_config: connection_config,
96
- config_dir: @config_dir,
133
+ schema_dir: @schema_dir,
97
134
  dump_target: dump_target,
98
135
  logger: logger,
99
136
  io: $stdout,
@@ -102,6 +139,14 @@ module Exwiw
102
139
  end
103
140
 
104
141
  private def validate_options!
142
+ # Fill in any options not given on the CLI from the config file. Done first
143
+ # so a config-provided `adapter` is in place before normalization below.
144
+ # CLI values always win (the merge only fills nil/empty ivars).
145
+ apply_config_file!
146
+
147
+ # Default log level once CLI and config have both had their say.
148
+ @log_level ||= :info
149
+
105
150
  # Fold driver/Rails adapter spellings (mysql2, sqlite3) into exwiw's
106
151
  # canonical names up front, so every check below — and the
107
152
  # EXWIW_DATABASE_ADAPTER passed to hooks — sees the canonical name.
@@ -163,18 +208,18 @@ module Exwiw
163
208
  end
164
209
  end
165
210
 
166
- if @config_dir.nil?
167
- $stderr.puts "Config dir is required"
211
+ if @schema_dir.nil?
212
+ $stderr.puts "Schema dir is required (pass --schema-dir or set schema_dir in the config file)"
168
213
  exit 1
169
214
  end
170
215
 
171
- unless Dir.exist?(@config_dir)
172
- $stderr.puts "Config dir does not exist: #{@config_dir}"
216
+ unless Dir.exist?(@schema_dir)
217
+ $stderr.puts "Schema dir does not exist: #{@schema_dir}"
173
218
  exit 1
174
219
  end
175
220
 
176
- if Dir.glob(File.join(@config_dir, "*.json")).empty?
177
- $stderr.puts "Config dir contains no .json files: #{@config_dir}"
221
+ if Dir.glob(File.join(@schema_dir, "*.json")).empty?
222
+ $stderr.puts "Schema dir contains no .json files: #{@schema_dir}"
178
223
  exit 1
179
224
  end
180
225
 
@@ -202,6 +247,78 @@ module Exwiw
202
247
  end
203
248
  end
204
249
 
250
+ # Merge settings from the config file (YAML) into any options the user did
251
+ # not pass on the CLI. The CLI always wins: every assignment below only fills
252
+ # an ivar that is still nil/empty after parsing ARGV. Connection settings
253
+ # (except `adapter`) are rejected here — they belong on the CLI/env.
254
+ private def apply_config_file!
255
+ path =
256
+ if @config_file_path
257
+ unless File.file?(@config_file_path)
258
+ $stderr.puts "Config file not found: #{@config_file_path}"
259
+ exit 1
260
+ end
261
+ @config_file_path
262
+ else
263
+ DEFAULT_CONFIG_PATHS.map { |p| File.expand_path(p) }.find { |p| File.file?(p) }
264
+ end
265
+ return if path.nil?
266
+
267
+ # Paths inside the config file are resolved relative to the file's own
268
+ # directory (not cwd), so `schema_dir: exwiw/schema` reads naturally with the
269
+ # config kept at the project root, and an absolute --config works from any
270
+ # cwd. (CLI path flags stay cwd-relative — each source resolves relative to
271
+ # where it is written.) `path` is always absolute here.
272
+ base = File.dirname(path)
273
+
274
+ config = YAML.safe_load(File.read(path)) || {}
275
+ unless config.is_a?(Hash)
276
+ $stderr.puts "Config file must be a YAML mapping (key: value): #{path}"
277
+ exit 1
278
+ end
279
+
280
+ config.each_key do |key|
281
+ if REJECTED_CONNECTION_KEYS.include?(key)
282
+ $stderr.puts "'#{key}' is a database connection setting and must be passed via the CLI/environment, not the config file (#{path})"
283
+ exit 1
284
+ end
285
+ unless ALLOWED_CONFIG_KEYS.include?(key)
286
+ $stderr.puts "Unknown config key '#{key}' in #{path}. Allowed keys: #{ALLOWED_CONFIG_KEYS.join(', ')}"
287
+ exit 1
288
+ end
289
+ end
290
+
291
+ # For `explain`, drop export-only keys so a config shared with `export`
292
+ # does not make validate_explain_only! reject the run.
293
+ config = config.reject { |k, _| EXPORT_ONLY_CONFIG_KEYS.include?(k) } if @subcommand == "explain"
294
+
295
+ @database_adapter ||= config["adapter"]
296
+ @schema_dir ||= expand_dir(config["schema_dir"], base)
297
+ @output_dir ||= expand_dir(config["output_dir"], base)
298
+ @after_insert_hook_path ||= (File.expand_path(config["after_insert_hook"], base) if config["after_insert_hook"])
299
+ @output_format ||= config["output_format"]
300
+ @insert_only = config["insert_only"] if @insert_only.nil? && config.key?("insert_only")
301
+ @log_level ||= config["log_level"]&.to_sym
302
+ @target_table_name ||= config["target_table"]
303
+ @target_collection_name ||= config["target_collection"]
304
+ if @ids.empty? && config.key?("ids")
305
+ raw = config["ids"]
306
+ # Accept either a YAML list or a "1,2" string; coerce to strings to match
307
+ # the CLI's `--ids=1,2` -> ["1", "2"] shape.
308
+ @ids = (raw.is_a?(String) ? raw.split(",") : Array(raw)).map(&:to_s)
309
+ end
310
+ @ids_field ||= config["ids_field"]
311
+ @ids_column ||= config["ids_column"]
312
+ end
313
+
314
+ # Strip a trailing slash (like the CLI's dir options) and expand relative to
315
+ # `base` (the config file's directory). Returns nil for a nil value.
316
+ private def expand_dir(value, base)
317
+ return nil if value.nil?
318
+ value = value.end_with?("/") ? value[0..-2] : value
319
+ File.expand_path(value, base)
320
+ end
321
+
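Reviewer note: the precedence and path rules in `apply_config_file!` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the gem's code: `merge_config` and the option hash are hypothetical names, and only `adapter`, `schema_dir`, and `ids` are modelled.

```ruby
require "yaml"

# Hypothetical helper, not part of exwiw: fills only the options the CLI left
# unset and resolves file-relative paths against the config file's directory.
def merge_config(cli_options, yaml_text, config_path)
  base = File.dirname(File.expand_path(config_path))
  config = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, "config must be a YAML mapping" unless config.is_a?(Hash)

  merged = cli_options.dup
  # The CLI wins: `||=` assigns only when the CLI left the value nil.
  merged[:adapter] ||= config["adapter"]
  if config["schema_dir"]
    merged[:schema_dir] ||= File.expand_path(config["schema_dir"].chomp("/"), base)
  end
  # Accept a YAML list or a "1,2" string; normalise to an array of strings.
  if merged[:ids].empty? && config.key?("ids")
    raw = config["ids"]
    merged[:ids] = (raw.is_a?(String) ? raw.split(",") : Array(raw)).map(&:to_s)
  end
  merged
end

cli = { adapter: "mysql", schema_dir: nil, ids: [] }
yaml_text = "adapter: sqlite\nschema_dir: exwiw/schema/\nids: [1, 2]\n"
result = merge_config(cli, yaml_text, "/project/exwiw.yml")
```

Here `result[:adapter]` stays `"mysql"` because the CLI supplied it, while `schema_dir` and `ids` are taken from the file, with the directory resolved against the config file's location rather than the current working directory.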
205
322
  # `--target-collection` is a mongodb-only alias of `--target-table`. Fold it
206
323
  # into @target_table_name (the single field the rest of the CLI/runner uses)
207
324
  # after rejecting the misuses: combining it with --target-table, or using it
@@ -319,7 +436,7 @@ module Exwiw
319
436
  database_user: @database_user,
320
437
  database_password: @database_password,
321
438
  output_dir: @output_dir,
322
- config_dir: @config_dir,
439
+ schema_dir: @schema_dir,
323
440
  database_adapter: @database_adapter,
324
441
  database_name: @database_name,
325
442
  target_table: @target_table_name,
@@ -368,9 +485,12 @@ module Exwiw
368
485
  v = v.end_with?("/") ? v[0..-2] : v
369
486
  @output_dir = File.expand_path(v)
370
487
  end
371
- opts.on("-c", "--config-dir=CONFIG_DIR_PATH", "Config dir path.") do |v|
488
+ opts.on("--schema-dir=SCHEMA_DIR_PATH", "Directory of schema JSON files (or set schema_dir in the config file).") do |v|
372
489
  v = v.end_with?("/") ? v[0..-2] : v
373
- @config_dir = File.expand_path(v)
490
+ @schema_dir = File.expand_path(v)
491
+ end
492
+ opts.on("-c", "--config=CONFIG_FILE_PATH", "Path to the exwiw config YAML. Defaults to ./#{DEFAULT_CONFIG_PATHS.first} (or #{File.extname(DEFAULT_CONFIG_PATHS.last)}) when present. CLI options take precedence; paths inside the file are resolved relative to the file.") do |v|
493
+ @config_file_path = File.expand_path(v)
374
494
  end
375
495
  opts.on("-a", "--adapter=ADAPTER", "Database adapter: mysql, sqlite, postgresql, mongodb (aliases: mysql2, sqlite3)") { |v| @database_adapter = v }
376
496
  opts.on("--uri=URI", "Full MongoDB connection URI (mongodb:// or mongodb+srv://). mongodb adapter only; takes precedence over --host/--port/--user. TLS, replicaSet, authSource and credentials are read from the URI.") { |v| @connection_uri = v }
@@ -4,13 +4,13 @@ module Exwiw
4
4
  class ExplainRunner
5
5
  def initialize(
6
6
  connection_config:,
7
- config_dir:,
7
+ schema_dir:,
8
8
  dump_target:,
9
9
  logger:,
10
10
  io: $stdout
11
11
  )
12
12
  @connection_config = connection_config
13
- @config_dir = config_dir
13
+ @schema_dir = schema_dir
14
14
  @dump_target = dump_target
15
15
  @logger = logger
16
16
  @io = io
@@ -53,7 +53,7 @@ module Exwiw
53
53
  end
54
54
 
55
55
  private def load_table_config(klass)
56
- Dir[File.join(@config_dir, "*.json")].map do |file|
56
+ Dir[File.join(@schema_dir, "*.json")].map do |file|
57
57
  json = JSON.parse(File.read(file))
58
58
  klass.from(json).reject_ignored_members!
59
59
  end
@@ -20,6 +20,11 @@ module Exwiw
20
20
  # marks an unrepresentable collection `ignore: true`, to record why extraction
21
21
  # was skipped.
22
22
  attribute :comment, optional(String), skip_serializing_if_nil: true
23
+ # Free-form tag recording *why* this collection is ignored (e.g.
24
+ # "need_code_fix" for an application-side bug, "unsupported" for a shape
25
+ # exwiw cannot express). exwiw never interprets or emits it; informational
26
+ # and preserved across regeneration like `comment`.
27
+ attribute :ignore_type, optional(String), skip_serializing_if_nil: true
23
28
 
24
29
  # Marks this config as physically embedded inside another collection's
25
30
  # documents. When set, this config is not processed as a standalone dump
@@ -30,6 +35,7 @@ module Exwiw
30
35
  def self.from(obj)
31
36
  instance = super
32
37
  instance.__send__(:validate_embedded!)
38
+ instance.__send__(:validate_belongs_tos!)
33
39
  instance
34
40
  end
35
41
 
@@ -68,6 +74,7 @@ module Exwiw
68
74
  merged.filter = filter
69
75
  merged.bulk_insert_chunk_size = bulk_insert_chunk_size
70
76
  merged.ignore = ignore
77
+ merged.ignore_type = ignore_type
71
78
  # A freshly generated comment (e.g. the skip_unsupported marker) wins so
72
79
  # it stays accurate; otherwise a hand-added note on a normal collection
73
80
  # is kept.
@@ -84,6 +91,7 @@ module Exwiw
84
91
  if receiver_bt
85
92
  pbt.comment = receiver_bt.comment if receiver_bt.comment
86
93
  pbt.ignore = receiver_bt.ignore unless receiver_bt.ignore.nil?
94
+ pbt.ignore_type = receiver_bt.ignore_type if receiver_bt.ignore_type
87
95
  pbt.references = receiver_bt.references if receiver_bt.references
88
96
  end
89
97
  pbt
@@ -114,5 +122,18 @@ module Exwiw
114
122
  "belongs_tos must be empty (cross-collection refs from inside embedded arrays " \
115
123
  "are not supported)."
116
124
  end
125
+
126
+ # `table_name` is optional only so an *ignored* relation (a stale belongs_to
127
+ # whose target collection no longer exists) can be recorded without one. A
128
+ # belongs_to that still participates in extraction must name its target.
129
+ private def validate_belongs_tos!
130
+ offender = belongs_tos.find { |bt| bt.table_name.nil? && !bt.ignore }
131
+ return unless offender
132
+
133
+ raise ArgumentError,
134
+ "MongodbCollectionConfig '#{name}' has a belongs_to (foreign_key " \
135
+ "'#{offender.foreign_key}') with no table_name; only an `ignore: true` belongs_to " \
136
+ "may omit it."
137
+ end
117
138
  end
118
139
  end
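Reviewer note: the rule enforced by `validate_belongs_tos!` (only an ignored belongs_to may omit `table_name`) can be sketched with plain structs. `BelongsTo` and the free-standing method below are hypothetical stand-ins for illustration, not exwiw's classes.

```ruby
# Hypothetical stand-in for a belongs_to entry in a collection config.
BelongsTo = Struct.new(:table_name, :foreign_key, :ignore, keyword_init: true)

# Raises when a belongs_to that still participates in extraction has no target.
def validate_belongs_tos!(name, belongs_tos)
  offender = belongs_tos.find { |bt| bt.table_name.nil? && !bt.ignore }
  return unless offender

  raise ArgumentError,
        "'#{name}' has a belongs_to (foreign_key '#{offender.foreign_key}') with no table_name"
end

ignored = [BelongsTo.new(table_name: nil, foreign_key: "legacy_id", ignore: true)]
active  = [BelongsTo.new(table_name: nil, foreign_key: "shop_id", ignore: nil)]

validate_belongs_tos!("orders", ignored) # passes: the entry is ignored

error =
  begin
    validate_belongs_tos!("orders", active)
    nil
  rescue ArgumentError => e
    e
  end
```

The ignored entry passes without a `table_name`, while the active one is rejected and the error names the offending foreign key.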
@@ -49,7 +49,7 @@ module Exwiw
49
49
  end
50
50
 
51
51
  def generate!
52
- collections = build_collections
52
+ collections = build_collections(existing_configs_by_name(@output_dir))
53
53
  write_files(@output_dir, collections)
54
54
  collections
55
55
  end
@@ -61,11 +61,33 @@ module Exwiw
61
61
  # subclasses share the base's collection (Mongoid STI, discriminated by the
62
62
  # auto-added `_type` field) collapses into a single config that aggregates
63
63
  # every class's fields and associations. See `expand_with_descendants`.
64
- def build_collections
64
+ #
65
+ # `existing_by_name` maps a collection name to its config already on disk, so
66
+ # the build can honor an explicit `ignore: true` (collection- or
67
+ # belongs_to-level) without re-introspecting it — and thus without aborting
68
+ # on a construct the user has deliberately ignored. Empty (the default) when
69
+ # called directly without an output dir, in which case nothing is honored.
70
+ def build_collections(existing_by_name = {})
65
71
  models = expand_with_descendants(concrete_models)
66
72
  models
67
73
  .group_by { |model| model.collection_name.to_s }
68
- .map { |collection_name, group| build_collection_for(collection_name, group) }
74
+ .map { |collection_name, group| build_collection_for(collection_name, group, existing_by_name[collection_name]) }
75
+ end
76
+
77
+ # Loads the configs already on disk so the generator can honor an explicit
78
+ # `ignore: true` without re-introspecting (and thus without aborting on a
79
+ # construct the user has deliberately ignored). A file that fails to parse or
80
+ # validate is skipped; a fresh run simply has none, and write_files surfaces
81
+ # genuine problems when it later merges/rewrites.
82
+ private def existing_configs_by_name(dir)
83
+ return {} unless dir && File.directory?(dir)
84
+
85
+ Dir[File.join(dir, "*.json")].each_with_object({}) do |path, acc|
86
+ config = MongodbCollectionConfig.from(JSON.parse(File.read(path)))
87
+ acc[config.name] = config
88
+ rescue JSON::ParserError, ArgumentError
89
+ next
90
+ end
69
91
  end
70
92
 
71
93
  def write_files(dir, collections)
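Reviewer note: the skip-on-parse-failure loading in `existing_configs_by_name` can be tried out on a temporary directory. This is a simplified sketch: plain hashes stand in for `MongodbCollectionConfig`, and `KeyError` replaces the `ArgumentError` the real loader rescues.

```ruby
require "json"
require "tmpdir"

# Simplified sketch: index the JSON configs in `dir` by their "name" key,
# skipping any file that does not parse or has no name.
def existing_configs_by_name(dir)
  return {} unless dir && File.directory?(dir)

  Dir[File.join(dir, "*.json")].each_with_object({}) do |path, acc|
    config = JSON.parse(File.read(path))
    acc[config.fetch("name")] = config
  rescue JSON::ParserError, KeyError
    next # unparseable or nameless file: leave it out of the index
  end
end

loaded = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "users.json"), JSON.generate("name" => "users", "ignore" => true))
  File.write(File.join(dir, "broken.json"), "{ not json")
  existing_configs_by_name(dir)
end
```

Only `users` ends up in `loaded`; the broken file is silently skipped, and a missing directory yields an empty hash.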
@@ -88,7 +110,15 @@ module Exwiw
88
110
  # belongs_tos are unioned across the group; processing least-derived first
89
111
  # keeps the base's fields leading the list and the output deterministic
90
112
  # regardless of input order or sibling subclasses.
91
- private def build_collection_for(collection_name, models)
113
+ private def build_collection_for(collection_name, models, existing = nil)
114
+ # An explicit on-disk `ignore: true` means the user has triaged this
115
+ # collection and asked exwiw to leave it alone: preserve their config
116
+ # (ignore_type / comment intact) and skip introspection entirely, so a
117
+ # construct exwiw cannot represent never aborts a run the user has already
118
+ # accounted for. (A collection is never dumped while ignored, so its
119
+ # fields/structure need not track the model.)
120
+ return existing if existing&.ignore
121
+
92
122
  ordered = models.sort_by { |model| [model.fields.size, model.name] }
93
123
 
94
124
  attrs = {
@@ -128,7 +158,7 @@ module Exwiw
128
158
  attrs[:comment] = "exwiw could not derive embedded_in (#{reason}); marked ignore:true. Define this collection's embedded_in config by hand to dump/mask it."
129
159
  end
130
160
  else
131
- attrs[:belongs_tos] = aggregate_belongs_tos(ordered)
161
+ attrs[:belongs_tos] = aggregate_belongs_tos(ordered, existing)
132
162
  end
133
163
 
134
164
  MongodbCollectionConfig.from_symbol_keys(attrs)
@@ -200,7 +230,9 @@ module Exwiw
200
230
  end
201
231
  end
202
232
 
203
- private def aggregate_belongs_tos(models)
233
+ private def aggregate_belongs_tos(models, existing = nil)
234
+ ignored_by_fk = ignored_belongs_tos_by_foreign_key(existing)
235
+
204
236
  belongs_to_assocs = models.flat_map do |model|
205
237
  model.relations.values.select do |assoc|
206
238
  assoc.is_a?(::Mongoid::Association::Referenced::BelongsTo)
@@ -216,10 +248,22 @@ module Exwiw
216
248
  # same belongs_to twice, so uniq them.
217
249
  belongs_to_assocs
218
250
  .reject(&:polymorphic?)
219
- .filter_map { |assoc| belongs_to_for(assoc) }
251
+ .filter_map { |assoc| belongs_to_for(assoc, ignored_by_fk) }
220
252
  .uniq
221
253
  end
222
254
 
255
+ # Maps foreign_key -> the on-disk `ignore: true` belongs_to entry, so a
256
+ # relation the user has explicitly ignored is preserved verbatim instead of
257
+ # re-resolved (which, for a stale relation whose target class is gone, would
258
+ # otherwise abort the run).
259
+ private def ignored_belongs_tos_by_foreign_key(existing)
260
+ return {} unless existing
261
+
262
+ existing.belongs_tos.select(&:ignore).each_with_object({}) do |bt, acc|
263
+ acc[bt.foreign_key] = bt
264
+ end
265
+ end
266
+
223
267
  # Resolves a referenced belongs_to to a `{ table_name, foreign_key }` pair
224
268
  # (plus `references` when the FK points at a non-`_id` parent field).
225
269
  # `assoc.klass` raises NameError when the association's target class no longer
@@ -227,7 +271,18 @@ module Exwiw
227
271
  # ago). Under `skip_unsupported` such a relation is skipped with a warning —
228
272
  # its foreign-key column is still tracked as an ordinary field by
229
273
  # `aggregate_fields`, mirroring how polymorphic / HABTM relations are dropped.
230
- private def belongs_to_for(assoc)
274
+ #
275
+ # `ignored_by_fk` carries the on-disk `ignore: true` belongs_to entries: when
276
+ # this relation's foreign key is among them, the user has explicitly ignored
277
+ # it, so preserve their entry verbatim (its `ignore_type` / `comment`) without
278
+ # resolving the — possibly gone — target. The relation is dropped from
279
+ # extraction at load (`#reject_ignored_members!`) while its FK column stays a
280
+ # field, and the run never aborts on a relation already triaged.
281
+ private def belongs_to_for(assoc, ignored_by_fk = {})
282
+ if (ignored = ignored_by_fk[assoc.foreign_key])
283
+ return preserve_ignored_belongs_to(ignored)
284
+ end
285
+
231
286
  result = { table_name: assoc.klass.collection_name.to_s, foreign_key: assoc.foreign_key }
232
287
  # Mongoid's `belongs_to ..., primary_key: :uuid` makes the child's foreign
233
288
  # key reference that parent field rather than the parent's `_id`. Surface
@@ -245,6 +300,21 @@ module Exwiw
245
300
  nil
246
301
  end
247
302
 
303
+ # Re-emits a user's on-disk ignored belongs_to as a symbol-keyed hash (the
304
+ # shape `build_collection_for` feeds to `from_symbol_keys`), carrying its
305
+ # `ignore` / `ignore_type` / `comment` (and `table_name` / `references` when
306
+ # present) so the annotation survives regeneration untouched.
307
+ private def preserve_ignored_belongs_to(bt)
308
+ {
309
+ table_name: bt.table_name,
310
+ foreign_key: bt.foreign_key,
311
+ references: bt.references,
312
+ ignore: true,
313
+ ignore_type: bt.ignore_type,
314
+ comment: bt.comment,
315
+ }.compact
316
+ end
317
+
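Reviewer note: the effect of `.compact` in `preserve_ignored_belongs_to` is that absent attributes are dropped from the re-emitted hash instead of being written as `nil`. A minimal sketch, with `IgnoredBelongsTo` as a hypothetical stand-in for the on-disk entry:

```ruby
# Hypothetical stand-in for an on-disk `ignore: true` belongs_to entry.
IgnoredBelongsTo = Struct.new(
  :table_name, :foreign_key, :references, :ignore_type, :comment,
  keyword_init: true
)

# Re-emit the entry as a symbol-keyed hash, omitting attributes that are nil.
def preserve_ignored(bt)
  {
    table_name: bt.table_name,
    foreign_key: bt.foreign_key,
    references: bt.references,
    ignore: true,
    ignore_type: bt.ignore_type,
    comment: bt.comment,
  }.compact
end

# A stale relation whose target is gone: no table_name, references, or comment.
stale = IgnoredBelongsTo.new(foreign_key: "campaign_id", ignore_type: "need_code_fix")
preserved = preserve_ignored(stale)
```

For the stale entry above, `preserved` contains only `foreign_key`, `ignore`, and `ignore_type`, which is the shape that lets an ignored relation without a `table_name` round-trip through regeneration.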
248
318
  # Resolves the `embedded_in` config for an embedded model. Each embedded
249
319
  # model points at its *immediate* embedding parent: the parent's collection
250
320
  # name plus the single document key (`store_as`, defaulting to the relation
@@ -317,31 +387,17 @@ module Exwiw
317
387
  )
318
388
  end
319
389
 
320
- # `store_as` defaults to the relation name and is the actual document key
321
- # the subdocuments are stored under inside the immediate parent.
322
- parent_relation =
323
- begin
324
- parent.relations[assoc.inverse.to_s]
325
- rescue ::Mongoid::Errors::MongoidError, NameError => e
326
- # e.g. AmbiguousRelationship: the embedded class is embedded under
327
- # several document keys in the parent (or otherwise has no single
328
- # resolvable inverse), so exwiw cannot pick the one path it lives under.
329
- raise UnsupportedEmbedding.new(
330
- "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
331
- "declares `embedded_in :#{assoc.name}` whose inverse on '#{parent.name}' is ambiguous " \
332
- "or unresolvable (#{e.class}: #{e.message.lines.first&.strip}). Add an `inverse_of:` to " \
333
- "disambiguate, or define the collection's config by hand.",
334
- reason: "has an embedded_in :#{assoc.name} with an ambiguous/unresolvable inverse",
335
- )
336
- end
390
+ # Resolve the document key (`store_as`, defaulting to the relation name)
391
+ # the subdocuments live under inside the parent.
392
+ parent_relation = embedding_relation_in(parent, assoc, model)
337
393
 
338
394
  unless parent_relation
339
- # `assoc.inverse` resolved to a name that is not an association on the
340
- # parent (or to nothing), so there is no document key to embed under.
395
+ # No embeds_one / embeds_many on the parent stores this collection, so
396
+ # there is no document key to embed under.
341
397
  raise UnsupportedEmbedding.new(
342
398
  "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
343
- "declares `embedded_in :#{assoc.name}` but its inverse relation could not be located on " \
344
- "'#{parent.name}' (the embedding document key is indeterminable). Add an `inverse_of:`, or " \
399
+ "declares `embedded_in :#{assoc.name}` but no embeds_one/embeds_many on '#{parent.name}' " \
400
+ "stores this collection (the embedding document key is indeterminable). Add an `inverse_of:`, or " \
345
401
  "define the collection's config by hand.",
346
402
  reason: "has an embedded_in :#{assoc.name} whose inverse relation could not be located",
347
403
  )
@@ -350,6 +406,64 @@ module Exwiw
350
406
  { collection_name: parent.collection_name.to_s, path: parent_relation.store_as }
351
407
  end
352
408
 
409
+ # Locates the parent's `embeds_one` / `embeds_many` association that stores
410
+ # this embedded collection — i.e. the document key the subdocuments live
411
+ # under. Mongoid's computed `assoc.inverse` is preferred when it resolves
412
+ # cleanly, but it is frequently `nil` (no explicit `inverse_of:` and Mongoid
413
+ # declines to infer one) or raises `AmbiguousRelationship`; in those cases
414
+ # fall back to matching the parent's embedding relations by the collection
415
+ # they store. This resolves the common single-embedding case that
416
+ # `assoc.inverse` cannot (e.g. an `embeds_one :force_logout` / `embedded_in
417
+ # :customer` pair with no inverse_of). Returns the relation, `nil` when none
418
+ # stores this collection, and raises `UnsupportedEmbedding` when several
419
+ # distinct keys do (genuinely ambiguous — exwiw cannot pick one).
420
+ private def embedding_relation_in(parent, assoc, model)
421
+ inverse_name =
422
+ begin
423
+ assoc.inverse
424
+ rescue ::Mongoid::Errors::MongoidError, NameError
425
+ nil
426
+ end
427
+
428
+ if inverse_name
429
+ rel = parent.relations[inverse_name.to_s]
430
+ return rel if rel
431
+ end
432
+
433
+ candidates = parent.relations.values.select do |rel|
434
+ (rel.is_a?(::Mongoid::Association::Embedded::EmbedsMany) ||
435
+ rel.is_a?(::Mongoid::Association::Embedded::EmbedsOne)) &&
436
+ embeds_collection?(rel, model)
437
+ end
438
+ paths = candidates.map(&:store_as).uniq
439
+
440
+ if paths.size > 1
441
+ # The same collection is embedded under several document keys in the
442
+ # parent, so `embedded_in :#{assoc.name}` has no single resolvable path.
443
+ raise UnsupportedEmbedding.new(
444
+ "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
445
+ "is embedded under multiple document keys (#{paths.join(', ')}) in '#{parent.name}', so its " \
446
+ "`embedded_in :#{assoc.name}` is ambiguous or unresolvable — exwiw cannot pick the single path " \
447
+ "it lives under. Add an `inverse_of:` to disambiguate, or define the collection's config by hand.",
448
+ reason: "has an embedded_in :#{assoc.name} with an ambiguous/unresolvable inverse",
449
+ )
450
+ end
451
+
452
+ candidates.first
453
+ end
454
+
455
+ # True when `rel` (an embeds_one / embeds_many on the parent) stores the same
456
+ # collection as `model`. Comparing collection names (rather than class
457
+ # identity) also matches an STI subclass embedded through a relation declared
458
+ # against its base class, since both share the base's collection. A sibling
459
+ # embedding relation whose target class no longer resolves is treated as a
460
+ # non-match rather than blowing up the whole derivation.
461
+ private def embeds_collection?(rel, model)
462
+ rel.klass.collection_name.to_s == model.collection_name.to_s
463
+ rescue NameError, ::Mongoid::Errors::MongoidError
464
+ false
465
+ end
466
+
353
467
  private def embedded_in_association(model)
354
468
  model.relations.values.find do |assoc|
355
469
  assoc.is_a?(::Mongoid::Association::Embedded::EmbeddedIn)
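Reviewer note: the lookup order in `embedding_relation_in` (computed inverse first, then a match by stored collection, with an error when several document keys match) can be sketched without Mongoid. `Relation`, `AmbiguousEmbedding`, and `embedding_relation` are hypothetical names, and every relation in the hash is assumed to be an `embeds_one` / `embeds_many`.

```ruby
# Hypothetical stand-in for a Mongoid embedding association on the parent.
Relation = Struct.new(:name, :store_as, :collection, keyword_init: true)

class AmbiguousEmbedding < StandardError; end

def embedding_relation(relations, inverse_name, collection)
  # 1. Prefer the computed inverse when it names a relation on the parent.
  rel = inverse_name && relations[inverse_name.to_s]
  return rel if rel

  # 2. Otherwise fall back to the relations that store the same collection.
  candidates = relations.values.select { |r| r.collection == collection }
  paths = candidates.map(&:store_as).uniq
  # Several distinct document keys: there is no single path to pick.
  raise AmbiguousEmbedding, "embedded under #{paths.join(', ')}" if paths.size > 1

  candidates.first
end

relations = {
  "force_logout" => Relation.new(name: "force_logout", store_as: "force_logout", collection: "force_logouts"),
  "home" => Relation.new(name: "home", store_as: "home_addr", collection: "addresses"),
  "work" => Relation.new(name: "work", store_as: "work_addr", collection: "addresses"),
}

found   = embedding_relation(relations, nil, "force_logouts")
by_name = embedding_relation(relations, :home, "addresses")
missing = embedding_relation(relations, nil, "unknown")
```

With no inverse available, `force_logouts` still resolves through the collection match; `addresses` resolves only when an inverse name is given, and would raise `AmbiguousEmbedding` without one because it is stored under two keys.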
data/lib/exwiw/runner.rb CHANGED
@@ -7,7 +7,7 @@ module Exwiw
7
7
  def initialize(
8
8
  connection_config:,
9
9
  output_dir:,
10
- config_dir:,
10
+ schema_dir:,
11
11
  dump_target:,
12
12
  logger:,
13
13
  output_format: 'insert',
@@ -17,7 +17,7 @@ module Exwiw
17
17
  )
18
18
  @connection_config = connection_config
19
19
  @output_dir = output_dir
20
- @config_dir = config_dir
20
+ @schema_dir = schema_dir
21
21
  @dump_target = dump_target
22
22
  @output_format = output_format
23
23
  @insert_only = insert_only
@@ -159,7 +159,7 @@ module Exwiw
159
159
  end
160
160
 
161
161
  private def load_table_config(klass)
162
- Dir[File.join(@config_dir, "*.json")].map do |file|
162
+ Dir[File.join(@schema_dir, "*.json")].map do |file|
163
163
  json = JSON.parse(File.read(file))
164
164
  # Drop belongs_tos/columns(fields) flagged ignore:true so they are not
165
165
  # considered during extraction. Done here (after loading from file)
data/lib/exwiw/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Exwiw
4
- VERSION = "0.4.10"
4
+ VERSION = "0.5.0"
5
5
  end
data/lib/tasks/exwiw.rake CHANGED
@@ -7,7 +7,7 @@ namespace :exwiw do
7
7
  require "exwiw"
8
8
 
9
9
  Exwiw::SchemaGenerator.from_rails_application(
10
- output_dir: ENV["OUTPUT_DIR_PATH"] || "exwiw",
10
+ output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
11
11
  ).generate!
12
12
  end
13
13
 
@@ -16,7 +16,7 @@ namespace :exwiw do
16
16
  require "exwiw"
17
17
 
18
18
  result = Exwiw::SchemaGenerator.from_rails_application(
19
- output_dir: ENV["OUTPUT_DIR_PATH"] || "exwiw",
19
+ output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
20
20
  ).tidy!
21
21
 
22
22
  if result.empty?
@@ -32,16 +32,22 @@ namespace :exwiw do
32
32
  end
33
33
 
34
34
  desc "Generate schema from a Mongoid application"
35
- # Set EXWIW_SKIP_UNSUPPORTED=1 to keep generation going past constructs exwiw
36
- # cannot represent (an unresolvable `belongs_to`, or a polymorphic / cyclic /
37
- # unresolvable `embedded_in`): the unresolvable belongs_to is skipped and an
35
+ # Fail-loud by default: the task aborts on a construct exwiw cannot represent
36
+ # (an unresolvable `belongs_to`, or a polymorphic / cyclic / ambiguous /
37
+ # unresolvable `embedded_in`). To keep a deliberately-unrepresentable
38
+ # collection or relation from aborting the run, mark it `ignore: true` in its
39
+ # config on disk (optionally with an `ignore_type` / `comment` recording why);
40
+ # the generator honors that and skips re-introspecting it. Set
41
+ # EXWIW_SKIP_UNSUPPORTED=1 to additionally keep going past *un-annotated*
42
+ # unrepresentable constructs (the unresolvable belongs_to is skipped and an
38
43
  # unrepresentable embedded collection is emitted as `ignore: true` with a
39
- # `comment`, each warned to stderr, instead of aborting the whole run.
44
+ # `comment`, each warned to stderr), which is useful for the first bootstrap pass
45
+ # against a large app before the ignores are written.
40
46
  task generate_mongoid: :environment do
41
47
  require "exwiw"
42
48
 
43
49
  Exwiw::MongoidSchemaGenerator.from_rails_application(
44
- output_dir: ENV["OUTPUT_DIR_PATH"] || "exwiw",
50
+ output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
45
51
  skip_unsupported: ENV["EXWIW_SKIP_UNSUPPORTED"] == "1",
46
52
  ).generate!
47
53
  end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: exwiw
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.4.10
4
+ version: 0.5.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Shia