exwiw 0.4.10 → 0.5.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +22 -0
- data/README.md +80 -14
- data/lib/exwiw/after_insert_hook.rb +1 -1
- data/lib/exwiw/belongs_to.rb +10 -1
- data/lib/exwiw/cli.rb +133 -13
- data/lib/exwiw/explain_runner.rb +3 -3
- data/lib/exwiw/mongodb_collection_config.rb +21 -0
- data/lib/exwiw/mongoid_schema_generator.rb +143 -29
- data/lib/exwiw/runner.rb +3 -3
- data/lib/exwiw/version.rb +1 -1
- data/lib/tasks/exwiw.rake +13 -7
- metadata +1 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 224bdc1d3b0f94e08463ad9e42a6e67d0592d902388b5873f5840226dbdbd3fe
|
|
4
|
+
data.tar.gz: de9ddd4a625565e0bcd28ff3f74df8da06092c443ad1f170d41c5858a24c4802
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: '08f564c07c09561a4b9b825bb7f6ca43a076df5b8262f165addad471639084fa5b5074330215edd36951ce0c427f510533f39d03e0632c1306ba4e9054391b33'
|
|
7
|
+
data.tar.gz: 2c161f236a676a15774fb097a7a2c4d66f95f38be8c465dfc29788b4c45165aa3b13dee2b1da771e317672e63f089d2496e10cf4fe2ebf16f97885e0e1c49c76
|
data/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,28 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.5.0] - 2026-06-16
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
|
|
9
|
+
- A YAML **config file** (`exwiw.yml`) can now hold any option except the database connection settings, so they no longer have to be repeated on every invocation. Pass it with `--config=PATH`; when `--config` is omitted, `exwiw.yml` (or `exwiw.yaml`) is loaded automatically from the current directory if present. **Options passed on the CLI take precedence** over the file (the file only fills in options not given on the CLI). Connection settings — `host`, `port`, `user`, `database`, `uri`, `password` — are **rejected** in the file (they must come from the CLI/environment); `adapter` is the one connection-related key allowed. Relative paths in the file (`schema_dir`, `output_dir`, `after_insert_hook`) are resolved relative to the config file's own directory (so a root-level `exwiw.yml` with `schema_dir: exwiw/schema` reads naturally and an absolute `--config` works from any directory). Unknown keys are rejected to catch typos, and export-only keys (`output_dir`, `output_format`, `insert_only`, `after_insert_hook`) are ignored under `explain` so one file can be shared by both subcommands.
|
|
10
|
+
|
|
11
|
+
### Changed
|
|
12
|
+
|
|
13
|
+
- **BREAKING**: the `export`/`explain` CLI option `--config-dir` has been renamed to `--schema-dir` to distinguish the directory of schema JSON files from the new `--config` config file. Its short form `-c` is now `--config` (the config file); `--schema-dir` has no short form. The hook contract is renamed to match: the shell-hook environment variable `EXWIW_CONFIG_DIR` is now `EXWIW_SCHEMA_DIR`, and the Ruby-hook `cli_options[:config_dir]` is now `cli_options[:schema_dir]`. Update invocations, scripts, and hooks accordingly (`--config-dir` no longer exists). `--schema-dir` is still required and has no default unless `schema_dir` is set in the config file.
|
|
14
|
+
- **BREAKING**: the env var that overrides where `schema:generate`, `schema:tidy`, and `schema:generate_mongoid` write their config has been renamed from `OUTPUT_DIR_PATH` to `EXWIW_SCHEMA_DIR_PATH`, and the default output directory is now `exwiw/schema` (previously `exwiw`). The new name disambiguates it from the dump-side `--output-dir`, and the dedicated `schema/` subdirectory leaves `exwiw/` free for other artifacts (hooks, dumps). `OUTPUT_DIR_PATH` is no longer read. Existing repositories should set `EXWIW_SCHEMA_DIR_PATH` (e.g. `EXWIW_SCHEMA_DIR_PATH=exwiw` to preserve the old flat layout) and/or move their config under `exwiw/schema/`; otherwise a `generate` run will write a fresh copy into `exwiw/schema/` and leave the old files stale. The `export`/`explain` CLI is unaffected, but examples now point at `exwiw/schema`.
|
|
15
|
+
|
|
16
|
+
## [0.4.11] - 2026-06-15
|
|
17
|
+
|
|
18
|
+
### Fixed
|
|
19
|
+
|
|
20
|
+
- MongoDB: `schema:generate_mongoid` now resolves an embedded collection's `embedded_in` document key by locating the parent's `embeds_one` / `embeds_many` that stores the collection, instead of trusting Mongoid's computed `assoc.inverse`. Mongoid returns a `nil` inverse for many valid embeddings (when no explicit `inverse_of:` is declared and it declines to infer one), and previously such a model was wrongly reported as having an "unresolvable inverse" and skipped (or aborted the run). Matching by the embedded collection also resolves an STI subclass embedded through a relation declared against its base class. Genuinely ambiguous embeddings (the same collection stored under several keys in the parent) are still reported as unrepresentable.
|
|
21
|
+
|
|
22
|
+
### Added
|
|
23
|
+
|
|
24
|
+
- MongoDB: `schema:generate_mongoid` now **honors an explicit `ignore: true` on disk** and skips re-introspecting it, so a construct exwiw cannot represent that you have already triaged no longer aborts the (fail-loud, default) run — and the annotation survives regeneration. Works at two granularities: a whole **collection** marked `ignore: true` is preserved as-is without introspection, and a single **`belongs_to`** marked `ignore: true` (no `table_name` required) is preserved while the rest of its collection still generates and dumps (its foreign-key column stays an ordinary field). This lets you keep the generator strict by default while opting individual stale/unrepresentable constructs out by hand, rather than relying on `EXWIW_SKIP_UNSUPPORTED`.
|
|
25
|
+
- `MongodbCollectionConfig` and `BelongsTo` gain an optional, user-owned **`ignore_type`** tag (free-form; exwiw never interprets or emits it) to record *why* something is ignored — e.g. `"need_code_fix"` for an application-side bug, `"unsupported"` for a shape exwiw cannot express. Preserved across regeneration like `comment`. `BelongsTo#table_name` is now optional so an ignored, no-longer-resolvable relation can be recorded without a target collection (a non-ignored `belongs_to` still requires one).
|
|
26
|
+
|
|
5
27
|
## [0.4.10] - 2026-06-12
|
|
6
28
|
|
|
7
29
|
### Fixed
|
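The precedence rule described in the 0.5.0 changelog entry (CLI options win, the config file only fills in what was not passed, connection keys and unknown keys are rejected) can be sketched as below. This is a minimal illustration, not the gem's implementation: `merge_options` and the two constants are hypothetical names, only the key lists are taken from the changelog text, and it raises `ArgumentError` as a stand-in for exwiw exiting with an error.

```ruby
# Key lists as stated in the changelog entry; the constant names are hypothetical.
ALLOWED_KEYS = %w[
  adapter schema_dir output_dir output_format insert_only after_insert_hook
  log_level target_table target_collection ids ids_field ids_column
].freeze
CONNECTION_KEYS = %w[host port user database uri password].freeze

# Returns the effective options. `cli` and `file` are plain string-keyed hashes;
# a nil CLI value means "not passed on the command line".
def merge_options(cli, file)
  file.each_key do |key|
    if CONNECTION_KEYS.include?(key)
      raise ArgumentError, "'#{key}' must come from the CLI/environment"
    end
    raise ArgumentError, "unknown config key '#{key}'" unless ALLOWED_KEYS.include?(key)
  end

  # Hash#merge with a block: for keys present in both, keep the CLI value
  # unless it is nil, in which case the file value fills the gap.
  file.merge(cli) { |_key, from_file, from_cli| from_cli.nil? ? from_file : from_cli }
end

cli  = { "schema_dir" => nil, "output_format" => "copy" }
file = { "schema_dir" => "exwiw/schema", "output_format" => "insert" }
effective = merge_options(cli, file)
# schema_dir is taken from the file, output_format from the CLI.
```

Checking for `nil` rather than truthiness matters for boolean options: an explicit `false` on the CLI still overrides a `true` in the file.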
data/README.md
CHANGED
|
@@ -72,7 +72,7 @@ exwiw \
|
|
|
72
72
|
--port=3306 \
|
|
73
73
|
--user=reader \
|
|
74
74
|
--database=app_production \
|
|
75
|
-
--
|
|
75
|
+
--schema-dir=exwiw/schema \
|
|
76
76
|
--target-table=shops \
|
|
77
77
|
--ids=1 \ # comma separated ids
|
|
78
78
|
--output-dir=dump \
|
|
@@ -81,7 +81,7 @@ exwiw \
|
|
|
81
81
|
|
|
82
82
|
By default `--ids` are matched against the target table's primary key. `--ids-column=COLUMN` matches them against a different column instead (e.g. `--target-table=users --ids=alice@example.com --ids-column=email`). Related tables are still extracted correctly: their foreign keys are resolved through the target via a subquery (`WHERE fk IN (SELECT pk FROM target WHERE COLUMN IN (...))`), so only the target table's filter column changes. This is the SQL-adapter counterpart of the mongodb `--ids-field`; the two are mutually exclusive and each is rejected by the other adapter family. Note: if `COLUMN` is itself masked, re-running `delete-*` against an already-imported (masked) dump won't match, so prefer a stable natural key.
|
|
83
83
|
|
|
84
|
-
When `--target-table` and `--ids` are omitted, exwiw dumps all tables defined in `--
|
|
84
|
+
When `--target-table` and `--ids` are omitted, exwiw dumps all tables defined in `--schema-dir`:
|
|
85
85
|
|
|
86
86
|
```bash
|
|
87
87
|
# dump all tables
|
|
@@ -91,7 +91,7 @@ exwiw \
|
|
|
91
91
|
--port=5432 \
|
|
92
92
|
--user=reader \
|
|
93
93
|
--database=app_production \
|
|
94
|
-
--
|
|
94
|
+
--schema-dir=exwiw/schema \
|
|
95
95
|
--output-dir=dump
|
|
96
96
|
```
|
|
97
97
|
|
|
@@ -123,25 +123,58 @@ exwiw explain \
|
|
|
123
123
|
--adapter=postgresql \
|
|
124
124
|
--host=localhost --port=5432 --user=reader \
|
|
125
125
|
--database=app_production \
|
|
126
|
-
--
|
|
126
|
+
--schema-dir=exwiw/schema \
|
|
127
127
|
--target-table=shops --ids=1
|
|
128
128
|
```
|
|
129
129
|
|
|
130
130
|
The `--output-dir`, `--output-format`, `--insert-only`, and `--after-insert-hook` options are dump-specific and rejected when used with `explain`.
|
|
131
131
|
|
|
132
|
+
### Config file (`exwiw.yml`)
|
|
133
|
+
|
|
134
|
+
Options you would otherwise repeat on every run can be kept in a YAML config file. Pass it with `--config=PATH`; when `--config` is omitted, exwiw automatically loads `exwiw.yml` (or `exwiw.yaml`) from the current directory if present.
|
|
135
|
+
|
|
136
|
+
**Options passed on the CLI always take precedence over the config file** — the config only fills in options you did not pass. This lets you commit the stable settings (which schema to read, output format, ...) while still varying the environment-specific connection details per invocation.
|
|
137
|
+
|
|
138
|
+
```yaml
|
|
139
|
+
# exwiw.yml — keep at the project root, alongside exwiw/schema/
|
|
140
|
+
adapter: postgresql
|
|
141
|
+
schema_dir: exwiw/schema
|
|
142
|
+
output_dir: dump
|
|
143
|
+
output_format: insert # insert | copy
|
|
144
|
+
insert_only: false
|
|
145
|
+
after_insert_hook: hooks/seed.rb
|
|
146
|
+
log_level: info # debug | info
|
|
147
|
+
# target_table / ids / ids_field / ids_column may also be set here
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
With the file above, only the connection details need to be supplied on the CLI:
|
|
151
|
+
|
|
152
|
+
```bash
|
|
153
|
+
DATABASE_PASSWORD=... exwiw \
|
|
154
|
+
--host=localhost --port=5432 --user=reader --database=app_production \
|
|
155
|
+
--target-table=shops --ids=1
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
Notes:
|
|
159
|
+
|
|
160
|
+
- **Database connection settings stay on the CLI/environment.** `host`, `port`, `user`, `database`, `uri`, and `password` are **rejected** in the config file (exwiw exits with an error). `adapter` is the one connection-related key that *is* allowed in the file.
|
|
161
|
+
- **Relative paths in the config (`schema_dir`, `output_dir`, `after_insert_hook`) are resolved relative to the config file's own directory**, not the current working directory. So with the config at the project root, `schema_dir: exwiw/schema` reads naturally, and an absolute `--config=/path/to/exwiw.yml` works no matter where you run from. (CLI path flags remain relative to the current directory — each source resolves relative to where it is written.) Absolute paths are used as-is.
|
|
162
|
+
- Unknown keys are rejected so a typo surfaces immediately.
|
|
163
|
+
- Export-only keys (`output_dir`, `output_format`, `insert_only`, `after_insert_hook`) are ignored when running `explain`, so a single config file can be shared by both subcommands.
|
|
164
|
+
|
|
132
165
|
### Generator
|
|
133
166
|
|
|
134
167
|
The config generator is provided as a Rake task.
|
|
135
168
|
|
|
136
169
|
```bash
|
|
137
|
-
# generate table schema under exwiw/
|
|
170
|
+
# generate table schema under exwiw/schema/
|
|
138
171
|
bundle exec rake exwiw:schema:generate
|
|
139
172
|
```
|
|
140
173
|
|
|
141
|
-
By default, the schema files will be saved in the `exwiw` directory. You can specify a different output directory by setting the `
|
|
174
|
+
By default, the schema files will be saved in the `exwiw/schema` directory. You can specify a different output directory by setting the `EXWIW_SCHEMA_DIR_PATH` environment variable:
|
|
142
175
|
|
|
143
176
|
```sh
|
|
144
|
-
|
|
177
|
+
EXWIW_SCHEMA_DIR_PATH=custom_directory bundle exec rake exwiw:schema:generate
|
|
145
178
|
```
|
|
146
179
|
|
|
147
180
|
#### Tidying stale config (`schema:tidy`)
|
|
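The path rule in the README notes above (relative paths in the config resolve against the config file's own directory, absolute paths are used as-is) can be illustrated with `File.expand_path`. A minimal sketch under assumptions: `resolve_config_path` is a hypothetical helper name, and the `/srv/app` paths are made-up POSIX examples.

```ruby
# Resolve a path written in the config file against the directory that
# contains the config file, not against the current working directory.
def resolve_config_path(value, config_path)
  return nil if value.nil?

  base = File.dirname(File.expand_path(config_path))
  # File.expand_path ignores `base` when `value` is already absolute.
  File.expand_path(value, base)
end

relative = resolve_config_path("exwiw/schema", "/srv/app/exwiw.yml")
absolute = resolve_config_path("/var/dump", "/srv/app/exwiw.yml")
# relative points under /srv/app, absolute is returned unchanged.
```

Resolving against the file's directory means the same `exwiw.yml` behaves identically whether the command is run from the project root or from elsewhere with an absolute `--config`.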
@@ -159,14 +192,14 @@ bundle exec rake exwiw:schema:tidy
|
|
|
159
192
|
|
|
160
193
|
Because it reads the database directly, a table that still exists in the database but has lost (or never had) an ActiveRecord model is **kept** — only a table that is genuinely gone is removed. (This is the deliberate counterpart to `generate`, which is model-driven and only ever adds what the models know about.)
|
|
161
194
|
|
|
162
|
-
It respects `
|
|
195
|
+
It respects `EXWIW_SCHEMA_DIR_PATH` and the per-database subdirectory layout in the same way as `schema:generate`. Unlike `generate`, `tidy` never adds or regenerates entries — every surviving table/column (including hand-edited `comment` / `ignore` / `replace_with`) is left untouched, so it is safe to run on a customized config. The task prints which tables and columns it removed (or that the config was already tidy). Stale `belongs_tos` are not pruned by `tidy`; rerun `schema:generate` to refresh those.
|
|
163
196
|
|
|
164
197
|
#### Multiple databases
|
|
165
198
|
|
|
166
199
|
If the application uses Rails' multiple-database support (`connects_to`), `schema:generate` buckets models by the database they connect to and writes each database's config files into its own subdirectory of the output directory, named after the database config name (`primary`, `analytics`, ...):
|
|
167
200
|
|
|
168
201
|
```
|
|
169
|
-
exwiw/
|
|
202
|
+
exwiw/schema/
|
|
170
203
|
primary/
|
|
171
204
|
shops.json
|
|
172
205
|
users.json
|
|
@@ -194,20 +227,53 @@ It is a distinct task and class (`Exwiw::MongoidSchemaGenerator`) from the Activ
|
|
|
194
227
|
- the collection name and the `_id` primary key,
|
|
195
228
|
- `fields` from the declared Mongoid fields (referenced `belongs_to` foreign keys such as `shop_id`, and the `created_at` / `updated_at` columns added by `Mongoid::Timestamps`, are ordinary fields — their BSON `ObjectId` / `Date` values serialize as MongoDB Extended JSON at dump time). For an aliased field (`field :ctry, as: :country`), the generator emits the **stored** document key (`ctry`), never the Ruby accessor (`country`), so masking and projection target the key that actually appears in the document, and additionally records the accessor as `mongoid_field_name` on that field so the short key stays understandable (association aliases such as `shop => shop_id` and the built-in `id => _id` are not field renames and are not annotated),
|
|
196
229
|
- `belongs_tos` from referenced `belongs_to` associations (`{ table_name, foreign_key }`). A referenced `belongs_to` declared on an *embedded* document is dropped (cross-collection refs from inside embedded subdocuments are unsupported — see [MongoDB notes](#mongodb-notes)), but its foreign-key column is still kept as an ordinary field. A `has_and_belongs_to_many` association is also dropped (its foreign keys are stored as an array field, e.g. `tag_ids`, which exwiw cannot follow as a single-valued foreign key), while that `*_ids` array column is kept as an ordinary field,
|
|
197
|
-
- `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations. Each embedded config names its *immediate* parent collection and the document key it lives under (`store_as`, defaulting to the relation name); nested embedding is represented as a chain (`comments` → `embedded_in` `posts`, `posts` → `embedded_in` `users`) rather than a flattened dot-path, matching how the adapter recurses through array and Hash subdocuments. A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and so cannot be expressed as an `embedded_in` config
|
|
230
|
+
- `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations. Each embedded config names its *immediate* parent collection and the document key it lives under (`store_as`, defaulting to the relation name); nested embedding is represented as a chain (`comments` → `embedded_in` `posts`, `posts` → `embedded_in` `users`) rather than a flattened dot-path, matching how the adapter recurses through array and Hash subdocuments. The document key is resolved by locating the parent's `embeds_one` / `embeds_many` that stores this collection. (Mongoid's computed inverse is frequently `nil` when no explicit `inverse_of:` is set, so exwiw matches by the collection the parent's embedding relations store rather than trusting that inverse — this also resolves an STI subclass embedded through a relation declared against its base class.) When the same collection is embedded under several keys in the parent, the path is ambiguous and treated as unrepresentable (see below). A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and so cannot be expressed as an `embedded_in` config. A *self-referential / cyclic* embedding (Mongoid's `recursively_embeds_many` / `recursively_embeds_one`) makes a collection both a top-level document and embedded inside documents of its own type; exwiw represents a collection as either top-level or embedded, not both, so it cannot emit an `embedded_in` config that would silently make the collection undumpable. These unrepresentable shapes are handled best-effort by default and abort only in strict mode (see below).
|
|
198
231
|
|
|
199
232
|
Models in an inheritance hierarchy whose subclasses share the base's collection (Mongoid STI, distinguished by the auto-added `_type` discriminator) collapse into a single config: the generator discovers the subclasses via `descendants` (Mongoid registers only the base class in `Mongoid.models`) and unions every class's `fields` and `belongs_tos` into the collection config, so subclass-only fields and associations are not lost.
|
|
200
233
|
|
|
201
234
|
Regeneration preserves hand-edited `replace_with`, `filter`, `ignore`, and `bulk_insert_chunk_size` values, like the ActiveRecord generator. Indexes are not written to the config — they are introspected from the live database at dump time (see [MongoDB notes](#mongodb-notes)). Polymorphic `belongs_to` is not yet expanded by this task.
|
|
202
235
|
|
|
203
|
-
By default the task **aborts** when a model uses a construct exwiw cannot represent: a `belongs_to` whose target class can no longer be resolved (a stale relation left behind after its model was removed), or a polymorphic / self-referential-cyclic / unresolvable-parent `embedded_in` (see the cases above).
|
|
236
|
+
By default the task **aborts** when a model uses a construct exwiw cannot represent: a `belongs_to` whose target class can no longer be resolved (a stale relation left behind after its model was removed), or a polymorphic / self-referential-cyclic / ambiguous / unresolvable-parent `embedded_in` (see the cases above).
|
|
237
|
+
|
|
238
|
+
#### Honoring an explicit `ignore` (the recommended way to keep these out)
|
|
239
|
+
|
|
240
|
+
When you have reviewed such a construct and decided exwiw should leave it alone, mark it `ignore: true` in its config on disk. The generator **honors an explicit `ignore` and skips re-introspecting it**, so it never aborts the run on something you have already triaged — and your annotation survives regeneration. Two granularities:
|
|
241
|
+
|
|
242
|
+
- A whole **collection** exwiw cannot represent (e.g. a polymorphic / ambiguous `embedded_in`) — mark the collection config `"ignore": true`. To actually dump/mask it later, define its `embedded_in` config by hand (see [Embedded documents](#embedded-documents)).
|
|
243
|
+
- A single **`belongs_to`** that no longer resolves while the rest of its collection is fine (e.g. a stale relation pointing at a removed model) — mark that entry `"ignore": true`, with no `table_name`. The relation is dropped from extraction (`#reject_ignored_members!`) while its foreign-key column stays an ordinary field, and the collection keeps dumping.
|
|
244
|
+
|
|
245
|
+
Record *why* with the optional **`ignore_type`** (a free-form tag exwiw never interprets — e.g. `"need_code_fix"` for an application-side bug, `"unsupported"` for a shape exwiw cannot express) and a **`comment`**. Both are user-owned and preserved across regeneration; the generator never emits `ignore_type` itself.
|
|
246
|
+
|
|
247
|
+
```json
|
|
248
|
+
// orders.json — a stale belongs_to flagged for a code fix; the collection still dumps
|
|
249
|
+
{
|
|
250
|
+
"name": "orders",
|
|
251
|
+
"primary_key": "_id",
|
|
252
|
+
"belongs_to": [
|
|
253
|
+
{ "table_name": "shops", "foreign_key": "shop_id" },
|
|
254
|
+
{
|
|
255
|
+
"foreign_key": "coupon_id",
|
|
256
|
+
"ignore": true,
|
|
257
|
+
"ignore_type": "need_code_fix",
|
|
258
|
+
"comment": "FIXME: belongs_to :coupon -> Coupon does not exist (dead relation)."
|
|
259
|
+
}
|
|
260
|
+
],
|
|
261
|
+
"fields": [ /* ... coupon_id is kept as an ordinary field ... */ ]
|
|
262
|
+
}
|
|
263
|
+
```
|
|
264
|
+
|
|
265
|
+
#### First bootstrap pass: `EXWIW_SKIP_UNSUPPORTED=1`
|
|
266
|
+
|
|
267
|
+
For the very first pass against a large app — before any `ignore` annotations exist — set `EXWIW_SKIP_UNSUPPORTED=1` to keep going past *un-annotated* unrepresentable constructs instead of aborting one at a time:
|
|
204
268
|
|
|
205
269
|
```bash
|
|
206
270
|
EXWIW_SKIP_UNSUPPORTED=1 bundle exec rake exwiw:schema:generate_mongoid
|
|
207
271
|
```
|
|
208
272
|
|
|
209
273
|
- An unresolvable `belongs_to` is dropped from the collection's `belongs_tos` (its foreign-key column is still kept as an ordinary field, like the polymorphic / HABTM cases) and a warning naming the relation is printed to stderr.
|
|
210
|
-
- An unrepresentable `embedded_in` collection is emitted as a **top-level** config marked `"ignore": true` with a `comment` recording why, and a warning is printed.
|
|
274
|
+
- An unrepresentable `embedded_in` collection is emitted as a **top-level** config marked `"ignore": true` with a `comment` recording why, and a warning is printed.
|
|
275
|
+
|
|
276
|
+
Review the stderr warnings, annotate the affected configs (`ignore` / `ignore_type` / `comment`), and subsequent runs complete without the flag because the generator honors those explicit ignores.
|
|
211
277
|
|
|
212
278
|
### Configuration
|
|
213
279
|
|
|
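The README text in this hunk says the `embedded_in` document key is found by locating the parent's `embeds_one` / `embeds_many` that stores the collection, with several matches treated as ambiguous. The sketch below shows that matching rule on plain Ruby data. It is an assumption-laden illustration: `EmbedRelation` and `resolve_embedded_key` are hypothetical names, and it does not call Mongoid or reproduce `Exwiw::MongoidSchemaGenerator`.

```ruby
# Hypothetical stand-in for one embeds_one / embeds_many relation on the parent.
EmbedRelation = Struct.new(:name, :store_as, :stored_collection, keyword_init: true)

# Returns the document key the embedded collection lives under
# (`store_as`, defaulting to the relation name). Raises when no parent
# relation stores the collection, or when several do (ambiguous).
def resolve_embedded_key(parent_relations, embedded_collection)
  matches = parent_relations.select { |rel| rel.stored_collection == embedded_collection }

  case matches.size
  when 1
    (matches.first.store_as || matches.first.name).to_s
  when 0
    raise ArgumentError, "no parent relation stores #{embedded_collection}"
  else
    keys = matches.map(&:name).join(", ")
    raise ArgumentError, "#{embedded_collection} is embedded under several keys: #{keys}"
  end
end

relations = [
  EmbedRelation.new(name: :comments, store_as: "cs", stored_collection: "comments"),
  EmbedRelation.new(name: :tags, stored_collection: "tags"),
]
comments_key = resolve_embedded_key(relations, "comments") # uses store_as
tags_key     = resolve_embedded_key(relations, "tags")     # falls back to the relation name
```

Matching on the stored collection rather than on the relation's declared class is what lets a subclass embedded through a base-class relation resolve, since both share one collection.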
@@ -234,7 +300,7 @@ This is an example of the one table schema:
|
|
|
234
300
|
}
|
|
235
301
|
```
|
|
236
302
|
|
|
237
|
-
`--
|
|
303
|
+
`--schema-dir` will use all json files in the specified directory.
|
|
238
304
|
|
|
239
305
|
### Output format
|
|
240
306
|
|
|
@@ -274,7 +340,7 @@ SQL
|
|
|
274
340
|
|
|
275
341
|
**Shell hook**: anything other than `.rb` is exec'd as a child process. It is a pure side-effect hook — exwiw does not capture its stdout. The hook receives these env vars and inherits `DATABASE_PASSWORD` from the parent:
|
|
276
342
|
|
|
277
|
-
- `EXWIW_OUTPUT_DIR`, `
|
|
343
|
+
- `EXWIW_OUTPUT_DIR`, `EXWIW_SCHEMA_DIR`
|
|
278
344
|
- `EXWIW_DATABASE_ADAPTER`, `EXWIW_DATABASE_HOST`, `EXWIW_DATABASE_PORT`, `EXWIW_DATABASE_USER`, `EXWIW_DATABASE_NAME`
|
|
279
345
|
- `EXWIW_TARGET_TABLE`, `EXWIW_IDS` (comma-separated), `EXWIW_OUTPUT_FORMAT`
|
|
280
346
|
|
data/lib/exwiw/after_insert_hook.rb
CHANGED
|
@@ -31,7 +31,7 @@ module Exwiw
|
|
|
31
31
|
def self.run_shell(path:, cli_options:, output_dir:, logger:)
|
|
32
32
|
env = {
|
|
33
33
|
'EXWIW_OUTPUT_DIR' => output_dir,
|
|
34
|
-
'
|
|
34
|
+
'EXWIW_SCHEMA_DIR' => cli_options[:schema_dir].to_s,
|
|
35
35
|
'EXWIW_DATABASE_ADAPTER' => cli_options[:database_adapter].to_s,
|
|
36
36
|
'EXWIW_DATABASE_HOST' => cli_options[:database_host].to_s,
|
|
37
37
|
'EXWIW_DATABASE_PORT' => cli_options[:database_port].to_s,
|
data/lib/exwiw/belongs_to.rb
CHANGED
|
@@ -5,7 +5,11 @@ module Exwiw
|
|
|
5
5
|
include Serdes
|
|
6
6
|
|
|
7
7
|
attribute :foreign_key, String
|
|
8
|
-
|
|
8
|
+
# Optional so an ignored, no-longer-resolvable relation (a stale
|
|
9
|
+
# `belongs_to` whose target class is gone) can be recorded with no target
|
|
10
|
+
# collection. A non-ignored belongs_to still requires it — enforced by the
|
|
11
|
+
# owning config's validation (e.g. MongodbCollectionConfig#validate_belongs_tos!).
|
|
12
|
+
attribute :table_name, optional(String), skip_serializing_if_nil: true
|
|
9
13
|
# Set only for a polymorphic association. `foreign_type` is the name of the
|
|
10
14
|
# column storing the type (e.g. `reviewable_type`), and `type_value` is the
|
|
11
15
|
# value held in that column (e.g. `"Product"`). Both are nil for a
|
|
@@ -32,6 +36,11 @@ module Exwiw
|
|
|
32
36
|
# extraction once the config is loaded (see #reject_ignored_members!).
|
|
33
37
|
attribute :comment, optional(String), skip_serializing_if_nil: true
|
|
34
38
|
attribute :ignore, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
|
|
39
|
+
# Free-form tag recording *why* this relation is ignored (e.g.
|
|
40
|
+
# "need_code_fix" for an application-side bug, "unsupported" for a shape
|
|
41
|
+
# exwiw cannot express). exwiw never interprets or emits it; purely
|
|
42
|
+
# informational and preserved across regeneration like `comment`.
|
|
43
|
+
attribute :ignore_type, optional(String), skip_serializing_if_nil: true
|
|
35
44
|
|
|
36
45
|
def self.from_symbol_keys(hash)
|
|
37
46
|
from(hash.transform_keys(&:to_s))
|
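The comment added to `belongs_to.rb` states that `table_name` is optional only for an ignored relation and that the owning config enforces this. The actual validation code is not part of this diff, so the following is only a sketch of the rule on plain hashes; `invalid_belongs_tos` is a hypothetical helper, not the gem's `validate_belongs_tos!`.

```ruby
# Returns the belongs_to entries that break the rule: a non-ignored entry
# must name a target collection, an ignored one may omit it.
def invalid_belongs_tos(belongs_tos)
  belongs_tos.reject do |entry|
    entry["ignore"] == true || !entry["table_name"].to_s.empty?
  end
end

entries = [
  { "table_name" => "shops", "foreign_key" => "shop_id" },
  { "foreign_key" => "coupon_id", "ignore" => true, "ignore_type" => "need_code_fix" },
  { "foreign_key" => "owner_id" }, # not ignored and no target: invalid
]
offenders = invalid_belongs_tos(entries)
# offenders contains only the owner_id entry.
```

Note that `ignore_type` plays no part in the check, which matches the source's statement that exwiw never interprets it.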
data/lib/exwiw/cli.rb
CHANGED
|
@@ -5,6 +5,7 @@ require 'optparse'
|
|
|
5
5
|
require 'pathname'
|
|
6
6
|
|
|
7
7
|
require 'json'
|
|
8
|
+
require 'yaml'
|
|
8
9
|
|
|
9
10
|
require 'exwiw'
|
|
10
11
|
|
|
@@ -12,6 +13,39 @@ module Exwiw
|
|
|
12
13
|
class CLI
|
|
13
14
|
KNOWN_SUBCOMMANDS = %w[export explain].freeze
|
|
14
15
|
|
|
16
|
+
# Config file loaded automatically when --config is omitted, if one exists in
|
|
17
|
+
# the current directory. Kept at the project root (rather than under exwiw/)
|
|
18
|
+
# so that config-relative paths like `schema_dir: exwiw/schema` read naturally.
|
|
19
|
+
# Both extensions are accepted; .yml wins when both are present.
|
|
20
|
+
DEFAULT_CONFIG_PATHS = %w[exwiw.yml exwiw.yaml].freeze
|
|
21
|
+
|
|
22
|
+
# Keys accepted in the config file. Anything outside this set is rejected so
|
|
23
|
+
# a typo surfaces immediately instead of being silently ignored. These mirror
|
|
24
|
+
# the non-connection CLI options (plus `adapter`).
|
|
25
|
+
ALLOWED_CONFIG_KEYS = %w[
|
|
26
|
+
adapter
|
|
27
|
+
schema_dir
|
|
28
|
+
output_dir
|
|
29
|
+
output_format
|
|
30
|
+
insert_only
|
|
31
|
+
after_insert_hook
|
|
32
|
+
log_level
|
|
33
|
+
target_table
|
|
34
|
+
target_collection
|
|
35
|
+
ids
|
|
36
|
+
ids_field
|
|
37
|
+
ids_column
|
|
38
|
+
].freeze
|
|
39
|
+
|
|
40
|
+
# Database connection settings are environment-specific (and sometimes
|
|
41
|
+
# secret-adjacent), so they must be passed via CLI/env, never the committed
|
|
42
|
+
# config file. `adapter` is the one connection-ish key allowed in config.
|
|
43
|
+
REJECTED_CONNECTION_KEYS = %w[host port user database uri password].freeze
|
|
44
|
+
|
|
45
|
+
# Keys that only make sense for `export`. They are skipped when merging config
|
|
46
|
+
# for `explain` so a shared config file does not trip validate_explain_only!.
|
|
47
|
+
EXPORT_ONLY_CONFIG_KEYS = %w[output_dir output_format insert_only after_insert_hook].freeze
|
|
48
|
+
|
|
15
49
|
def self.start(argv)
|
|
16
50
|
new(argv).run
|
|
17
51
|
end
|
|
@@ -34,7 +68,8 @@ module Exwiw
|
|
|
34
68
|
@database_password = ENV["DATABASE_PASSWORD"]
|
|
35
69
|
@connection_uri = nil
|
|
36
70
|
@output_dir = nil
|
|
37
|
-
@
|
|
71
|
+
@schema_dir = nil
|
|
72
|
+
@config_file_path = nil
|
|
38
73
|
@database_adapter = nil
|
|
39
74
|
@database_name = nil
|
|
40
75
|
@target_table_name = nil
|
|
@@ -45,7 +80,9 @@ module Exwiw
|
|
|
45
80
|
@output_format = nil
|
|
46
81
|
@insert_only = nil
|
|
47
82
|
@after_insert_hook_path = nil
|
|
48
|
-
|
|
83
|
+
# nil (not :info) so we can tell "user passed --log-level" from the default,
|
|
84
|
+
# letting a config-file value fill in; the :info default is applied later.
|
|
85
|
+
@log_level = nil
|
|
49
86
|
|
|
50
87
|
parser.parse!(@argv)
|
|
51
88
|
end
|
|
@@ -82,7 +119,7 @@ module Exwiw
|
|
|
82
119
|
Runner.new(
|
|
83
120
|
connection_config: connection_config,
|
|
84
121
|
output_dir: @output_dir,
|
|
85
|
-
|
|
122
|
+
schema_dir: @schema_dir,
|
|
86
123
|
dump_target: dump_target,
|
|
87
124
|
output_format: @output_format,
|
|
88
125
|
insert_only: @insert_only,
|
|
@@ -93,7 +130,7 @@ module Exwiw
|
|
|
93
130
|
when "explain"
|
|
94
131
|
ExplainRunner.new(
|
|
95
132
|
connection_config: connection_config,
|
|
96
|
-
|
|
133
|
+
schema_dir: @schema_dir,
|
|
97
134
|
dump_target: dump_target,
|
|
98
135
|
logger: logger,
|
|
99
136
|
io: $stdout,
|
|
@@ -102,6 +139,14 @@ module Exwiw
|
|
|
102
139
|
end
|
|
103
140
|
|
|
104
141
|
private def validate_options!
|
|
142
|
+
# Fill in any options not given on the CLI from the config file. Done first
|
|
143
|
+
# so a config-provided `adapter` is in place before normalization below.
|
|
144
|
+
# CLI values always win (the merge only fills nil/empty ivars).
|
|
145
|
+
apply_config_file!
|
|
146
|
+
|
|
147
|
+
# Default log level once CLI and config have both had their say.
|
|
148
|
+
@log_level ||= :info
|
|
149
|
+
|
|
105
150
|
# Fold driver/Rails adapter spellings (mysql2, sqlite3) into exwiw's
|
|
106
151
|
# canonical names up front, so every check below — and the
|
|
107
152
|
# EXWIW_DATABASE_ADAPTER passed to hooks — sees the canonical name.
|
|
@@ -163,18 +208,18 @@ module Exwiw
|
|
|
163
208
|
end
|
|
164
209
|
end
|
|
165
210
|
|
|
166
|
-
if @
|
|
167
|
-
$stderr.puts "
|
|
211
|
+
if @schema_dir.nil?
|
|
212
|
+
$stderr.puts "Schema dir is required (pass --schema-dir or set schema_dir in the config file)"
|
|
168
213
|
exit 1
|
|
169
214
|
end
|
|
170
215
|
|
|
171
|
-
unless Dir.exist?(@
|
|
172
|
-
$stderr.puts "
|
|
216
|
+
unless Dir.exist?(@schema_dir)
|
|
217
|
+
$stderr.puts "Schema dir does not exist: #{@schema_dir}"
|
|
173
218
|
exit 1
|
|
174
219
|
end
|
|
175
220
|
|
|
176
|
-
if Dir.glob(File.join(@
|
|
177
|
-
$stderr.puts "
|
|
221
|
+
if Dir.glob(File.join(@schema_dir, "*.json")).empty?
|
|
222
|
+
$stderr.puts "Schema dir contains no .json files: #{@schema_dir}"
|
|
178
223
|
exit 1
|
|
179
224
|
end
|
|
180
225
|
|
|
@@ -202,6 +247,78 @@ module Exwiw
       end
     end

+    # Merge settings from the config file (YAML) into any options the user did
+    # not pass on the CLI. The CLI always wins: every assignment below only fills
+    # an ivar that is still nil/empty after parsing ARGV. Connection settings
+    # (except `adapter`) are rejected here — they belong on the CLI/env.
+    private def apply_config_file!
+      path =
+        if @config_file_path
+          unless File.file?(@config_file_path)
+            $stderr.puts "Config file not found: #{@config_file_path}"
+            exit 1
+          end
+          @config_file_path
+        else
+          DEFAULT_CONFIG_PATHS.map { |p| File.expand_path(p) }.find { |p| File.file?(p) }
+        end
+      return if path.nil?
+
+      # Paths inside the config file are resolved relative to the file's own
+      # directory (not cwd), so `schema_dir: exwiw/schema` reads naturally with the
+      # config kept at the project root, and an absolute --config works from any
+      # cwd. (CLI path flags stay cwd-relative — each source resolves relative to
+      # where it is written.) `path` is always absolute here.
+      base = File.dirname(path)
+
+      config = YAML.safe_load(File.read(path)) || {}
+      unless config.is_a?(Hash)
+        $stderr.puts "Config file must be a YAML mapping (key: value): #{path}"
+        exit 1
+      end
+
+      config.each_key do |key|
+        if REJECTED_CONNECTION_KEYS.include?(key)
+          $stderr.puts "'#{key}' is a database connection setting and must be passed via the CLI/environment, not the config file (#{path})"
+          exit 1
+        end
+        unless ALLOWED_CONFIG_KEYS.include?(key)
+          $stderr.puts "Unknown config key '#{key}' in #{path}. Allowed keys: #{ALLOWED_CONFIG_KEYS.join(', ')}"
+          exit 1
+        end
+      end
+
+      # For `explain`, drop export-only keys so a config shared with `export`
+      # does not make validate_explain_only! reject the run.
+      config = config.reject { |k, _| EXPORT_ONLY_CONFIG_KEYS.include?(k) } if @subcommand == "explain"
+
+      @database_adapter ||= config["adapter"]
+      @schema_dir ||= expand_dir(config["schema_dir"], base)
+      @output_dir ||= expand_dir(config["output_dir"], base)
+      @after_insert_hook_path ||= (File.expand_path(config["after_insert_hook"], base) if config["after_insert_hook"])
+      @output_format ||= config["output_format"]
+      @insert_only = config["insert_only"] if @insert_only.nil? && config.key?("insert_only")
+      @log_level ||= config["log_level"]&.to_sym
+      @target_table_name ||= config["target_table"]
+      @target_collection_name ||= config["target_collection"]
+      if @ids.empty? && config.key?("ids")
+        raw = config["ids"]
+        # Accept either a YAML list or a "1,2" string; coerce to strings to match
+        # the CLI's `--ids=1,2` -> ["1", "2"] shape.
+        @ids = (raw.is_a?(String) ? raw.split(",") : Array(raw)).map(&:to_s)
+      end
+      @ids_field ||= config["ids_field"]
+      @ids_column ||= config["ids_column"]
+    end
+
+    # Strip a trailing slash (like the CLI's dir options) and expand relative to
+    # `base` (the config file's directory). Returns nil for a nil value.
+    private def expand_dir(value, base)
+      return nil if value.nil?
+      value = value.end_with?("/") ? value[0..-2] : value
+      File.expand_path(value, base)
+    end
+
     # `--target-collection` is a mongodb-only alias of `--target-table`. Fold it
     # into @target_table_name (the single field the rest of the CLI/runner uses)
     # after rejecting the misuses: combining it with --target-table, or using it
|
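The merge rule in `apply_config_file!` (a value given on the CLI wins, the file only fills unset options, and relative paths resolve against the config file's own directory) can be tried in isolation. The following is a minimal sketch under stated assumptions, not the gem's code: `merge_config`, `cli_opts`, and the trimmed key lists are hypothetical names invented for this example.

```ruby
require "yaml"

# Hypothetical, trimmed-down key lists for illustration only.
REJECTED_KEYS = %w[host port user database uri password].freeze
ALLOWED_KEYS  = %w[adapter schema_dir output_dir log_level ids].freeze
PATH_KEYS     = %w[schema_dir output_dir].freeze

# Returns a copy of cli_opts with gaps filled from the YAML text.
# `base` is the directory the config file lives in; relative paths
# from the file are resolved against it rather than the cwd.
def merge_config(cli_opts, yaml_text, base)
  config = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, "config must be a YAML mapping" unless config.is_a?(Hash)

  config.each_key do |key|
    raise ArgumentError, "'#{key}' must come from the CLI/environment" if REJECTED_KEYS.include?(key)
    raise ArgumentError, "unknown config key '#{key}'" unless ALLOWED_KEYS.include?(key)
  end

  config.each_with_object(cli_opts.dup) do |(key, value), merged|
    next unless merged[key].nil? # a value given on the CLI always wins

    merged[key] = PATH_KEYS.include?(key) ? File.expand_path(value.chomp("/"), base) : value
  end
end

merged = merge_config(
  { "adapter" => "mysql", "schema_dir" => nil },
  "adapter: postgresql\nschema_dir: exwiw/schema/\n",
  "/srv/app"
)
```

Here `merged["adapter"]` stays `"mysql"` because it was set on the (simulated) CLI, while `schema_dir` is taken from the file and expanded under `/srv/app`.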
|
@@ -319,7 +436,7 @@ module Exwiw
         database_user: @database_user,
         database_password: @database_password,
         output_dir: @output_dir,
-
+        schema_dir: @schema_dir,
         database_adapter: @database_adapter,
         database_name: @database_name,
         target_table: @target_table_name,
|
|
@@ -368,9 +485,12 @@ module Exwiw
           v = v.end_with?("/") ? v[0..-2] : v
           @output_dir = File.expand_path(v)
         end
-        opts.on("
+        opts.on("--schema-dir=SCHEMA_DIR_PATH", "Directory of schema JSON files. (or set schema_dir in the config file)") do |v|
           v = v.end_with?("/") ? v[0..-2] : v
-          @
+          @schema_dir = File.expand_path(v)
+        end
+        opts.on("-c", "--config=CONFIG_FILE_PATH", "Path to the exwiw config YAML. Defaults to ./#{DEFAULT_CONFIG_PATHS.first} (or .#{File.extname(DEFAULT_CONFIG_PATHS.last)}) when present. CLI options take precedence; paths inside the file are resolved relative to the file.") do |v|
+          @config_file_path = File.expand_path(v)
         end
         opts.on("-a", "--adapter=ADAPTER", "Database adapter: mysql, sqlite, postgresql, mongodb (aliases: mysql2, sqlite3)") { |v| @database_adapter = v }
         opts.on("--uri=URI", "Full MongoDB connection URI (mongodb:// or mongodb+srv://). mongodb adapter only; takes precedence over --host/--port/--user. TLS, replicaSet, authSource and credentials are read from the URI.") { |v| @connection_uri = v }
|
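For readers who want to see the two new path flags in action, here is a standalone sketch using Ruby's standard `OptionParser`. It assumes nothing about the rest of the CLI: `parse_flags` is a hypothetical helper, and only `--schema-dir` and `-c/--config` are modeled. Both values are expanded against the current working directory, matching the cwd-relative behavior the diff describes for CLI path flags.

```ruby
require "optparse"

# Hypothetical helper: parses only the two path flags discussed above.
def parse_flags(argv)
  opts = {}
  OptionParser.new do |parser|
    parser.on("--schema-dir=SCHEMA_DIR_PATH") do |v|
      # Drop one trailing slash, then make the path absolute (cwd-relative).
      opts[:schema_dir] = File.expand_path(v.chomp("/"))
    end
    parser.on("-c", "--config=CONFIG_FILE_PATH") do |v|
      opts[:config] = File.expand_path(v)
    end
  end.parse(argv)
  opts
end

flags = parse_flags(["--schema-dir=exwiw/schema/", "-c", "exwiw.yml"])
```

Because the config path is made absolute at parse time, a later `File.dirname` on it gives a stable base directory regardless of where the command was started.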
data/lib/exwiw/explain_runner.rb
CHANGED
|
@@ -4,13 +4,13 @@ module Exwiw
   class ExplainRunner
     def initialize(
       connection_config:,
-
+      schema_dir:,
       dump_target:,
       logger:,
       io: $stdout
     )
       @connection_config = connection_config
-      @
+      @schema_dir = schema_dir
       @dump_target = dump_target
       @logger = logger
       @io = io
|
|
@@ -53,7 +53,7 @@ module Exwiw
     end

     private def load_table_config(klass)
-      Dir[File.join(@
+      Dir[File.join(@schema_dir, "*.json")].map do |file|
         json = JSON.parse(File.read(file))
         klass.from(json).reject_ignored_members!
       end
|
data/lib/exwiw/mongodb_collection_config.rb
CHANGED
|
@@ -20,6 +20,11 @@ module Exwiw
     # marks an unrepresentable collection `ignore: true`, to record why extraction
     # was skipped.
     attribute :comment, optional(String), skip_serializing_if_nil: true
+    # Free-form tag recording *why* this collection is ignored (e.g.
+    # "need_code_fix" for an application-side bug, "unsupported" for a shape
+    # exwiw cannot express). exwiw never interprets or emits it; informational
+    # and preserved across regeneration like `comment`.
+    attribute :ignore_type, optional(String), skip_serializing_if_nil: true

     # Marks this config as physically embedded inside another collection's
     # documents. When set, this config is not processed as a standalone dump
|
|
@@ -30,6 +35,7 @@ module Exwiw
     def self.from(obj)
       instance = super
       instance.__send__(:validate_embedded!)
+      instance.__send__(:validate_belongs_tos!)
       instance
     end

|
|
@@ -68,6 +74,7 @@ module Exwiw
       merged.filter = filter
       merged.bulk_insert_chunk_size = bulk_insert_chunk_size
       merged.ignore = ignore
+      merged.ignore_type = ignore_type
       # A freshly generated comment (e.g. the skip_unsupported marker) wins so
       # it stays accurate; otherwise a hand-added note on a normal collection
       # is kept.
|
|
@@ -84,6 +91,7 @@ module Exwiw
         if receiver_bt
           pbt.comment = receiver_bt.comment if receiver_bt.comment
           pbt.ignore = receiver_bt.ignore unless receiver_bt.ignore.nil?
+          pbt.ignore_type = receiver_bt.ignore_type if receiver_bt.ignore_type
           pbt.references = receiver_bt.references if receiver_bt.references
         end
         pbt
|
|
@@ -114,5 +122,18 @@ module Exwiw
         "belongs_tos must be empty (cross-collection refs from inside embedded arrays " \
         "are not supported)."
     end
+
+    # `table_name` is optional only so an *ignored* relation (a stale belongs_to
+    # whose target collection no longer exists) can be recorded without one. A
+    # belongs_to that still participates in extraction must name its target.
+    private def validate_belongs_tos!
+      offender = belongs_tos.find { |bt| bt.table_name.nil? && !bt.ignore }
+      return unless offender
+
+      raise ArgumentError,
+        "MongodbCollectionConfig '#{name}' has a belongs_to (foreign_key " \
+        "'#{offender.foreign_key}') with no table_name; only an `ignore: true` belongs_to " \
+        "may omit it."
+    end
   end
 end
|
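The rule enforced by `validate_belongs_tos!` (a `belongs_to` may omit `table_name` only when it is marked `ignore: true`) is easy to exercise outside the gem. This sketch is an illustration only: the `BelongsTo` struct and the free-standing method signature are hypothetical stand-ins, not the gem's serialization classes.

```ruby
# Hypothetical stand-in for a belongs_to entry; not the gem's class.
BelongsTo = Struct.new(:foreign_key, :table_name, :ignore, keyword_init: true)

# Raises when any non-ignored belongs_to lacks a target table name.
def validate_belongs_tos!(name, belongs_tos)
  offender = belongs_tos.find { |bt| bt.table_name.nil? && !bt.ignore }
  return unless offender

  raise ArgumentError,
        "'#{name}' has a belongs_to (foreign_key '#{offender.foreign_key}') " \
        "with no table_name; only an `ignore: true` belongs_to may omit it."
end

stale  = BelongsTo.new(foreign_key: "legacy_id", ignore: true)      # allowed: ignored
live   = BelongsTo.new(foreign_key: "shop_id", table_name: "shops") # allowed: names target
broken = BelongsTo.new(foreign_key: "owner_id")                     # rejected

validate_belongs_tos!("orders", [stale, live]) # passes silently

begin
  validate_belongs_tos!("orders", [broken])
rescue ArgumentError => e
  warn e.message
end
```

Running the check at load time, as `self.from` does in the diff, means a malformed schema file fails before any extraction starts.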
|
data/lib/exwiw/mongoid_schema_generator.rb
CHANGED
@@ -49,7 +49,7 @@ module Exwiw
     end

     def generate!
-      collections = build_collections
+      collections = build_collections(existing_configs_by_name(@output_dir))
       write_files(@output_dir, collections)
       collections
     end
|
|
@@ -61,11 +61,33 @@ module Exwiw
     # subclasses share the base's collection (Mongoid STI, discriminated by the
     # auto-added `_type` field) collapses into a single config that aggregates
     # every class's fields and associations. See `expand_with_descendants`.
-
+    #
+    # `existing_by_name` maps a collection name to its config already on disk, so
+    # the build can honor an explicit `ignore: true` (collection- or
+    # belongs_to-level) without re-introspecting it — and thus without aborting
+    # on a construct the user has deliberately ignored. Empty (the default) when
+    # called directly without an output dir, in which case nothing is honored.
+    def build_collections(existing_by_name = {})
       models = expand_with_descendants(concrete_models)
       models
         .group_by { |model| model.collection_name.to_s }
-        .map { |collection_name, group| build_collection_for(collection_name, group) }
+        .map { |collection_name, group| build_collection_for(collection_name, group, existing_by_name[collection_name]) }
+    end
+
+    # Loads the configs already on disk so the generator can honor an explicit
+    # `ignore: true` without re-introspecting (and thus without aborting on a
+    # construct the user has deliberately ignored). A file that cannot be read or
+    # parsed is skipped — a fresh run simply has none, and write_files surfaces
+    # genuine problems when it later merges/rewrites.
+    private def existing_configs_by_name(dir)
+      return {} unless dir && File.directory?(dir)
+
+      Dir[File.join(dir, "*.json")].each_with_object({}) do |path, acc|
+        config = MongodbCollectionConfig.from(JSON.parse(File.read(path)))
+        acc[config.name] = config
+      rescue JSON::ParserError, ArgumentError
+        next
+      end
     end

     def write_files(dir, collections)
|
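The tolerant loading done by `existing_configs_by_name` can be sketched with plain JSON hashes. This is a simplified illustration, not the gem's implementation: it keeps raw hashes keyed by their `"name"` entry instead of building `MongodbCollectionConfig` objects, and it rescues `KeyError` where the real code rescues `ArgumentError`. The block-level `rescue` requires Ruby 2.6 or newer.

```ruby
require "json"
require "tmpdir"

# Hypothetical simplification: returns { collection name => parsed hash }.
def existing_configs_by_name(dir)
  return {} unless dir && File.directory?(dir)

  Dir[File.join(dir, "*.json")].each_with_object({}) do |path, acc|
    config = JSON.parse(File.read(path))
    acc[config.fetch("name")] = config
  rescue JSON::ParserError, KeyError
    next # invalid JSON or a mapping without "name": treat the file as absent
  end
end

loaded = Dir.mktmpdir do |dir|
  File.write(File.join(dir, "users.json"), JSON.generate("name" => "users", "ignore" => true))
  File.write(File.join(dir, "broken.json"), "{not json")
  existing_configs_by_name(dir)
end
```

Skipping the unparsable file keeps a first-time or partially edited schema directory from aborting generation; only the readable `users` config ends up in `loaded`.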
|
@@ -88,7 +110,15 @@ module Exwiw
     # belongs_tos are unioned across the group; processing least-derived first
     # keeps the base's fields leading the list and the output deterministic
     # regardless of input order or sibling subclasses.
-    private def build_collection_for(collection_name, models)
+    private def build_collection_for(collection_name, models, existing = nil)
+      # An explicit on-disk `ignore: true` means the user has triaged this
+      # collection and asked exwiw to leave it alone: preserve their config
+      # (ignore_type / comment intact) and skip introspection entirely, so a
+      # construct exwiw cannot represent never aborts a run the user has already
+      # accounted for. (A collection is never dumped while ignored, so its
+      # fields/structure need not track the model.)
+      return existing if existing&.ignore
+
       ordered = models.sort_by { |model| [model.fields.size, model.name] }

       attrs = {
|
|
@@ -128,7 +158,7 @@ module Exwiw
           attrs[:comment] = "exwiw could not derive embedded_in (#{reason}); marked ignore:true. Define this collection's embedded_in config by hand to dump/mask it."
         end
       else
-        attrs[:belongs_tos] = aggregate_belongs_tos(ordered)
+        attrs[:belongs_tos] = aggregate_belongs_tos(ordered, existing)
       end

       MongodbCollectionConfig.from_symbol_keys(attrs)
|
|
@@ -200,7 +230,9 @@ module Exwiw
       end
     end

-    private def aggregate_belongs_tos(models)
+    private def aggregate_belongs_tos(models, existing = nil)
+      ignored_by_fk = ignored_belongs_tos_by_foreign_key(existing)
+
       belongs_to_assocs = models.flat_map do |model|
         model.relations.values.select do |assoc|
           assoc.is_a?(::Mongoid::Association::Referenced::BelongsTo)
|
|
@@ -216,10 +248,22 @@ module Exwiw
       # same belongs_to twice, so uniq them.
       belongs_to_assocs
         .reject(&:polymorphic?)
-        .filter_map { |assoc| belongs_to_for(assoc) }
+        .filter_map { |assoc| belongs_to_for(assoc, ignored_by_fk) }
         .uniq
     end

+    # Maps foreign_key -> the on-disk `ignore: true` belongs_to entry, so a
+    # relation the user has explicitly ignored is preserved verbatim instead of
+    # re-resolved (which, for a stale relation whose target class is gone, would
+    # otherwise abort the run).
+    private def ignored_belongs_tos_by_foreign_key(existing)
+      return {} unless existing
+
+      existing.belongs_tos.select(&:ignore).each_with_object({}) do |bt, acc|
+        acc[bt.foreign_key] = bt
+      end
+    end
+
     # Resolves a referenced belongs_to to a `{ table_name, foreign_key }` pair
     # (plus `references` when the FK points at a non-`_id` parent field).
     # `assoc.klass` raises NameError when the association's target class no longer
|
|
@@ -227,7 +271,18 @@ module Exwiw
     # ago). Under `skip_unsupported` such a relation is skipped with a warning —
     # its foreign-key column is still tracked as an ordinary field by
     # `aggregate_fields`, mirroring how polymorphic / HABTM relations are dropped.
-
+    #
+    # `ignored_by_fk` carries the on-disk `ignore: true` belongs_to entries: when
+    # this relation's foreign key is among them, the user has explicitly ignored
+    # it, so preserve their entry verbatim (its `ignore_type` / `comment`) without
+    # resolving the — possibly gone — target. The relation is dropped from
+    # extraction at load (`#reject_ignored_members!`) while its FK column stays a
+    # field, and the run never aborts on a relation already triaged.
+    private def belongs_to_for(assoc, ignored_by_fk = {})
+      if (ignored = ignored_by_fk[assoc.foreign_key])
+        return preserve_ignored_belongs_to(ignored)
+      end
+
       result = { table_name: assoc.klass.collection_name.to_s, foreign_key: assoc.foreign_key }
       # Mongoid's `belongs_to ..., primary_key: :uuid` makes the child's foreign
       # key reference that parent field rather than the parent's `_id`. Surface
|
|
@@ -245,6 +300,21 @@ module Exwiw
       nil
     end

+    # Re-emits a user's on-disk ignored belongs_to as a symbol-keyed hash (the
+    # shape `build_collection_for` feeds to `from_symbol_keys`), carrying its
+    # `ignore` / `ignore_type` / `comment` (and `table_name` / `references` when
+    # present) so the annotation survives regeneration untouched.
+    private def preserve_ignored_belongs_to(bt)
+      {
+        table_name: bt.table_name,
+        foreign_key: bt.foreign_key,
+        references: bt.references,
+        ignore: true,
+        ignore_type: bt.ignore_type,
+        comment: bt.comment,
+      }.compact
+    end
+
     # Resolves the `embedded_in` config for an embedded model. Each embedded
     # model points at its *immediate* embedding parent: the parent's collection
     # name plus the single document key (`store_as`, defaulting to the relation
|
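To make the preservation path concrete, the sketch below indexes the ignored entries by foreign key and re-emits one as a symbol-keyed hash, with `Hash#compact` dropping whatever the user left unset. It is a minimal illustration under assumptions: `BelongsToEntry`, `ignored_by_foreign_key`, and `preserve_ignored` are hypothetical names that mirror, but are not, the gem's private methods.

```ruby
# Hypothetical stand-in for an on-disk belongs_to entry.
BelongsToEntry = Struct.new(
  :table_name, :foreign_key, :references, :ignore, :ignore_type, :comment,
  keyword_init: true
)

# Index only the entries the user marked `ignore: true`, keyed by foreign key.
def ignored_by_foreign_key(entries)
  entries.select(&:ignore).each_with_object({}) { |bt, acc| acc[bt.foreign_key] = bt }
end

# Re-emit an ignored entry; nil members (e.g. a missing table_name) are dropped.
def preserve_ignored(bt)
  {
    table_name: bt.table_name,
    foreign_key: bt.foreign_key,
    references: bt.references,
    ignore: true,
    ignore_type: bt.ignore_type,
    comment: bt.comment,
  }.compact
end

on_disk = [
  BelongsToEntry.new(foreign_key: "legacy_shop_id", ignore: true, ignore_type: "need_code_fix"),
  BelongsToEntry.new(table_name: "users", foreign_key: "user_id"),
]

ignored   = ignored_by_foreign_key(on_disk)
preserved = preserve_ignored(ignored.fetch("legacy_shop_id"))
```

Only `legacy_shop_id` is indexed, and its re-emitted hash carries `foreign_key`, `ignore`, and `ignore_type` but no `table_name`, which is the shape the relaxed `table_name` validation permits for ignored relations.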
|
@@ -317,31 +387,17 @@ module Exwiw
         )
       end

-      # `store_as
-      # the subdocuments
-      parent_relation =
-        begin
-          parent.relations[assoc.inverse.to_s]
-        rescue ::Mongoid::Errors::MongoidError, NameError => e
-          # e.g. AmbiguousRelationship: the embedded class is embedded under
-          # several document keys in the parent (or otherwise has no single
-          # resolvable inverse), so exwiw cannot pick the one path it lives under.
-          raise UnsupportedEmbedding.new(
-            "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
-            "declares `embedded_in :#{assoc.name}` whose inverse on '#{parent.name}' is ambiguous " \
-            "or unresolvable (#{e.class}: #{e.message.lines.first&.strip}). Add an `inverse_of:` to " \
-            "disambiguate, or define the collection's config by hand.",
-            reason: "has an embedded_in :#{assoc.name} with an ambiguous/unresolvable inverse",
-          )
-        end
+      # Resolve the document key (`store_as`, defaulting to the relation name)
+      # the subdocuments live under inside the parent.
+      parent_relation = embedding_relation_in(parent, assoc, model)

       unless parent_relation
-        #
-        #
+        # No embeds_one / embeds_many on the parent stores this collection, so
+        # there is no document key to embed under.
         raise UnsupportedEmbedding.new(
           "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
-          "declares `embedded_in :#{assoc.name}` but
-          "
+          "declares `embedded_in :#{assoc.name}` but no embeds_one/embeds_many on '#{parent.name}' " \
+          "stores this collection (the embedding document key is indeterminable). Add an `inverse_of:`, or " \
           "define the collection's config by hand.",
           reason: "has an embedded_in :#{assoc.name} whose inverse relation could not be located",
         )
|
|
@@ -350,6 +406,64 @@ module Exwiw
       { collection_name: parent.collection_name.to_s, path: parent_relation.store_as }
     end

+    # Locates the parent's `embeds_one` / `embeds_many` association that stores
+    # this embedded collection — i.e. the document key the subdocuments live
+    # under. Mongoid's computed `assoc.inverse` is preferred when it resolves
+    # cleanly, but it is frequently `nil` (no explicit `inverse_of:` and Mongoid
+    # declines to infer one) or raises `AmbiguousRelationship`; in those cases
+    # fall back to matching the parent's embedding relations by the collection
+    # they store. This resolves the common single-embedding case that
+    # `assoc.inverse` cannot (e.g. an `embeds_one :force_logout` / `embedded_in
+    # :customer` pair with no inverse_of). Returns the relation, `nil` when none
+    # stores this collection, and raises `UnsupportedEmbedding` when several
+    # distinct keys do (genuinely ambiguous — exwiw cannot pick one).
+    private def embedding_relation_in(parent, assoc, model)
+      inverse_name =
+        begin
+          assoc.inverse
+        rescue ::Mongoid::Errors::MongoidError, NameError
+          nil
+        end
+
+      if inverse_name
+        rel = parent.relations[inverse_name.to_s]
+        return rel if rel
+      end
+
+      candidates = parent.relations.values.select do |rel|
+        (rel.is_a?(::Mongoid::Association::Embedded::EmbedsMany) ||
+          rel.is_a?(::Mongoid::Association::Embedded::EmbedsOne)) &&
+          embeds_collection?(rel, model)
+      end
+      paths = candidates.map(&:store_as).uniq
+
+      if paths.size > 1
+        # The same collection is embedded under several document keys in the
+        # parent, so `embedded_in :#{assoc.name}` has no single resolvable path.
+        raise UnsupportedEmbedding.new(
+          "MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
+          "is embedded under multiple document keys (#{paths.join(', ')}) in '#{parent.name}', so its " \
+          "`embedded_in :#{assoc.name}` is ambiguous or unresolvable — exwiw cannot pick the single path " \
+          "it lives under. Add an `inverse_of:` to disambiguate, or define the collection's config by hand.",
+          reason: "has an embedded_in :#{assoc.name} with an ambiguous/unresolvable inverse",
+        )
+      end
+
+      candidates.first
+    end
+
+    # True when `rel` (an embeds_one / embeds_many on the parent) stores the same
+    # collection as `model`. Comparing collection names (rather than class
+    # identity) also matches an STI subclass embedded through a relation declared
+    # against its base class, since both share the base's collection. A sibling
+    # embedding relation whose target class no longer resolves is treated as a
+    # non-match rather than blowing up the whole derivation.
+    private def embeds_collection?(rel, model)
+      rel.klass.collection_name.to_s == model.collection_name.to_s
+    rescue NameError, ::Mongoid::Errors::MongoidError
+      false
+    end
+
     private def embedded_in_association(model)
       model.relations.values.find do |assoc|
         assoc.is_a?(::Mongoid::Association::Embedded::EmbeddedIn)
|
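The lookup order in `embedding_relation_in` (try the computed inverse first, then fall back to matching the parent's embedding relations by stored collection, and fail only when several distinct document keys match) can be demonstrated without Mongoid. The sketch below is an assumption-laden model of that logic: `Relation`, `AmbiguousEmbedding`, and the three-argument signature are hypothetical, and the Mongoid association objects are replaced by plain structs.

```ruby
# Hypothetical stand-in for an embeds_one / embeds_many relation on the parent.
Relation = Struct.new(:name, :store_as, :collection, keyword_init: true)

class AmbiguousEmbedding < StandardError; end

# relations:    { relation name => Relation } for the parent model
# inverse_name: the inverse the ORM computed, or nil when it could not
# collection:   collection name of the embedded model
def embedding_relation_in(relations, inverse_name, collection)
  if inverse_name
    rel = relations[inverse_name.to_s]
    return rel if rel
  end

  # Fallback: every parent relation that stores the embedded collection.
  candidates = relations.values.select { |rel| rel.collection == collection }
  paths = candidates.map(&:store_as).uniq
  raise AmbiguousEmbedding, "embedded under multiple keys: #{paths.join(', ')}" if paths.size > 1

  candidates.first # nil when nothing on the parent stores this collection
end

customer_relations = {
  "force_logout" => Relation.new(name: "force_logout", store_as: "force_logout", collection: "force_logouts"),
  "addresses"    => Relation.new(name: "addresses", store_as: "addrs", collection: "addresses"),
}

resolved = embedding_relation_in(customer_relations, nil, "force_logouts")
```

With no inverse available, the single relation storing `force_logouts` is still found, which corresponds to the `embeds_one :force_logout` / `embedded_in :customer` case named in the diff's comment.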
data/lib/exwiw/runner.rb
CHANGED
|
@@ -7,7 +7,7 @@ module Exwiw
     def initialize(
       connection_config:,
       output_dir:,
-
+      schema_dir:,
       dump_target:,
       logger:,
       output_format: 'insert',
|
|
@@ -17,7 +17,7 @@ module Exwiw
     )
       @connection_config = connection_config
       @output_dir = output_dir
-      @
+      @schema_dir = schema_dir
       @dump_target = dump_target
       @output_format = output_format
       @insert_only = insert_only
|
|
@@ -159,7 +159,7 @@ module Exwiw
     end

     private def load_table_config(klass)
-      Dir[File.join(@
+      Dir[File.join(@schema_dir, "*.json")].map do |file|
         json = JSON.parse(File.read(file))
         # Drop belongs_tos/columns(fields) flagged ignore:true so they are not
         # considered during extraction. Done here (after loading from file)
|
data/lib/exwiw/version.rb
CHANGED
data/lib/tasks/exwiw.rake
CHANGED
|
@@ -7,7 +7,7 @@ namespace :exwiw do
     require "exwiw"

     Exwiw::SchemaGenerator.from_rails_application(
-      output_dir: ENV["
+      output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
     ).generate!
   end

|
|
@@ -16,7 +16,7 @@ namespace :exwiw do
     require "exwiw"

     result = Exwiw::SchemaGenerator.from_rails_application(
-      output_dir: ENV["
+      output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
     ).tidy!

     if result.empty?
|
|
@@ -32,16 +32,22 @@ namespace :exwiw do
   end

   desc "Generate schema from a Mongoid application"
-  #
-  #
-  # unresolvable `embedded_in`)
+  # Fail-loud by default: the task aborts on a construct exwiw cannot represent
+  # (an unresolvable `belongs_to`, or a polymorphic / cyclic / ambiguous /
+  # unresolvable `embedded_in`). To keep a deliberately-unrepresentable
+  # collection or relation from aborting the run, mark it `ignore: true` in its
+  # config on disk (optionally with an `ignore_type` / `comment` recording why);
+  # the generator honors that and skips re-introspecting it. Set
+  # EXWIW_SKIP_UNSUPPORTED=1 to additionally keep going past *un-annotated*
+  # unrepresentable constructs (the unresolvable belongs_to is skipped and an
   # unrepresentable embedded collection is emitted as `ignore: true` with a
-  # `comment`, each warned to stderr
+  # `comment`, each warned to stderr) — useful for the first bootstrap pass
+  # against a large app before the ignores are written.
   task generate_mongoid: :environment do
     require "exwiw"

     Exwiw::MongoidSchemaGenerator.from_rails_application(
-      output_dir: ENV["
+      output_dir: ENV["EXWIW_SCHEMA_DIR_PATH"] || "exwiw/schema",
       skip_unsupported: ENV["EXWIW_SKIP_UNSUPPORTED"] == "1",
     ).generate!
   end
|