exwiw 0.2.8 → 0.2.9
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -0
- data/README.md +22 -0
- data/lib/exwiw/adapter/mongodb_adapter.rb +48 -9
- data/lib/exwiw/cli.rb +45 -0
- data/lib/exwiw/mongodb_collection_config.rb +33 -0
- data/lib/exwiw/mongodb_field.rb +5 -0
- data/lib/exwiw/mongoid_schema_generator.rb +240 -0
- data/lib/exwiw/query_ast_builder.rb +4 -0
- data/lib/exwiw/version.rb +1 -1
- data/lib/exwiw.rb +5 -1
- data/lib/tasks/exwiw.rake +9 -0
- metadata +2 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 7e5d922e740407599ecb3fa3992e0de402434bacfb38ca817a189a53acf16ab3
|
|
4
|
+
data.tar.gz: 2a8919778bb9395434587ebb69f49f4d8445cb1890e0999cb065b5577d802532
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: e38a240087564c3e3909106268fba7ad3a8dce881924f8b210b1733d65f0bc12d3bb514af80be21e6f0ad39707f99e96edbaed3cc7c54e2d26677c3a0c7d203f
|
|
7
|
+
data.tar.gz: cca54067266034f8df074fce301623fd6c0c860e9fd1a13eac43d930b89f46a7b63d3e6230d155abcce3f6ab021c2299d518f8131d26ee5bcbe7c0a535093ee9
|
data/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,18 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.2.9] - 2026-05-31
|
|
6
|
+
|
|
7
|
+
### Added
|
|
8
|
+
|
|
9
|
+
- New `--ids-field=FIELD` CLI option matches `--ids` against an arbitrary field on the target collection instead of its primary key (e.g. `--target-collection=users --ids=a@example.com --ids-field=email`). Only the target collection's filter changes — downstream foreign-key propagation still keys off the primary key. Unlike the primary-key path, the supplied ids are **not** type-coerced (a custom field's stored type is unknown, so values are passed through as-is). Currently **mongodb-only**: the SQL adapters (mysql2/postgresql/sqlite3) reject the flag at validation time, and threading `ids_field` through `QueryAstBuilder` for them is left as a TODO.
|
|
10
|
+
- New `--target-collection=COLLECTION` CLI option, a mongodb-only alias of `--target-table`. Specifying both, or using `--target-collection` with a non-mongodb adapter, is rejected at validation time.
|
|
11
|
+
- New rake task `exwiw:schema:generate_mongoid` (backed by `Exwiw::MongoidSchemaGenerator`) generates `MongodbCollectionConfig` files by introspecting Mongoid document models — a separate task/class from the ActiveRecord `schema:generate` because the ORMs expose different metadata. It derives the collection name, the `_id` primary key, `fields` (including referenced `belongs_to` foreign keys), `belongs_tos` from referenced `belongs_to` associations, and `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations (each embedded config names its immediate parent collection and `store_as` document key; nested embedding is emitted as a chain — `comments` embedded_in `posts`, `posts` embedded_in `users` — so the adapter can recurse through both array and Hash subdocuments). Regeneration preserves hand-edited `replace_with` / `filter` / `skip` / `bulk_insert_chunk_size`. Polymorphic `belongs_to` is not yet expanded. Models in an inheritance hierarchy whose subclasses share the base's collection (Mongoid STI, `_type` discriminator) collapse into a single config: subclasses are discovered via `descendants` (Mongoid registers only the base in `Mongoid.models`) and every class's `fields` / `belongs_tos` are unioned, so subclass-only fields and associations are preserved. A referenced `belongs_to` declared on an *embedded* document (e.g. `Comment embedded_in :post, belongs_to :author`) is dropped from the embedded config's `belongs_tos` (cross-collection refs from inside embedded subdocuments are unsupported and rejected on load), while its foreign-key column is still kept as an ordinary field. A `has_and_belongs_to_many` association is likewise dropped from `belongs_tos` (its foreign keys are stored as an array field such as `tag_ids`, which exwiw cannot follow as a single-valued foreign key), while that `*_ids` array column is kept as an ordinary field. 
A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and cannot be expressed as an `embedded_in` config, so the generator raises a clear, actionable error rather than crashing on the unresolvable parent class. A *self-referential / cyclic* embedding (Mongoid's `recursively_embeds_many` / `recursively_embeds_one`) makes a collection both top-level and embedded inside documents of its own type; since exwiw represents a collection as either top-level or embedded (not both), the generator likewise raises a clear error rather than emit an `embedded_in` config that would silently make the collection undumpable. The `created_at` / `updated_at` columns added by `include Mongoid::Timestamps` are tracked as ordinary fields, and their BSON `ObjectId` / `Date` values (the shape a live `find` returns) serialize as MongoDB Extended JSON (`$oid` / `$date`) through the dump path — now covered end-to-end against the generated configs. An aliased field (`field :ctry, as: :country`) is emitted by its **stored** document key (`ctry`), never the Ruby accessor (`country`), so masking and projection target the key that actually appears in the document; the accessor is additionally surfaced as `mongoid_field_name` on that field so the otherwise cryptic short key stays understandable (association aliases such as `shop => shop_id` and the built-in `id => _id` are not field renames and are not annotated).
|
|
12
|
+
|
|
13
|
+
### Fixed
|
|
14
|
+
|
|
15
|
+
- MongoDB adapter: `--ids` filtering against an `ObjectId` `_id` now works. `--ids` arrives as text and MongoDB compares types strictly, so a 24-char hex id is coerced to `BSON::ObjectId` (a plain String would never match). Integer-looking ids are still coerced to `Integer` and other strings (e.g. a String/UUID `_id`) are left as-is. This makes the `MongoidSchemaGenerator`-emitted `"primary_key": "_id"` configs usable end-to-end for the common case where `_id` is Mongoid's default `ObjectId`.
|
|
16
|
+
|
|
5
17
|
## [0.2.8] - 2026-05-31
|
|
6
18
|
|
|
7
19
|
### Added
|
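The nested-embedding chain described in the 0.2.9 changelog entry above (`comments` embedded_in `posts`, `posts` embedded_in `users`) can be pictured with a minimal sketch. The key names mirror attributes visible elsewhere in this diff (`name`, `primary_key`, `fields`, `mongoid_field_name`, `belongs_tos`, `embedded_in` with `collection_name` / `path`), but the configs are hand-written and the exact serialized output of the generator is an assumption. `embedding_chain` is a hypothetical helper, not part of exwiw.

```ruby
require "json"

# Hand-written illustration only; not output captured from the generator.
users = {
  "name" => "users",
  "primary_key" => "_id",
  "fields" => [
    { "name" => "_id" },
    # `field :ctry, as: :country` is emitted by its stored key, with the accessor noted.
    { "name" => "ctry", "mongoid_field_name" => "country" },
  ],
  "belongs_tos" => [],
}

posts = {
  "name" => "posts",
  "primary_key" => "_id",
  "fields" => [{ "name" => "_id" }, { "name" => "title" }],
  "belongs_tos" => [], # embedded configs carry no belongs_tos
  "embedded_in" => { "collection_name" => "users", "path" => "posts" },
}

comments = {
  "name" => "comments",
  "primary_key" => "_id",
  "fields" => [{ "name" => "_id" }, { "name" => "body" }],
  "belongs_tos" => [],
  "embedded_in" => { "collection_name" => "posts", "path" => "comments" },
}

configs = [users, posts, comments]

# Hypothetical helper: follow embedded_in links one parent at a time
# until a top-level collection (no embedded_in) is reached.
def embedding_chain(name, configs)
  by_name = configs.to_h { |config| [config["name"], config] }
  chain = [name]
  while (parent = by_name.fetch(chain.last)["embedded_in"])
    chain << parent["collection_name"]
  end
  chain
end

puts JSON.pretty_generate(comments)
p embedding_chain("comments", configs)
```

Each config names only its immediate parent, so the full path is recovered by walking the links rather than by reading a flattened `posts.comments` path.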
data/README.md
CHANGED
|
@@ -148,6 +148,25 @@ Each database keeps its own Rails migration history, so a `schema_migrations` (a
|
|
|
148
148
|
|
|
149
149
|
- The rails-managed table *names* are resolved from the global `ActiveRecord::Base.schema_migrations_table_name` / `internal_metadata_table_name` accessors, which are shared across all connections. A per-database override of these names is not detected, so such a table will be missing from that database's generated configs.
|
|
150
150
|
|
|
151
|
+
#### Mongoid applications
|
|
152
|
+
|
|
153
|
+
For MongoDB applications backed by [Mongoid](https://www.mongodb.com/docs/mongoid/), a separate rake task introspects Mongoid document models and emits `MongodbCollectionConfig` files (the `fields` / `_id` / `embedded_in` shape described under [MongoDB notes](#mongodb-notes)):
|
|
154
|
+
|
|
155
|
+
```bash
|
|
156
|
+
bundle exec rake exwiw:schema:generate_mongoid
|
|
157
|
+
```
|
|
158
|
+
|
|
159
|
+
It is a distinct task and class (`Exwiw::MongoidSchemaGenerator`) from the ActiveRecord generator because the two ORMs expose entirely different metadata. From each model it derives:
|
|
160
|
+
|
|
161
|
+
- the collection name and the `_id` primary key,
|
|
162
|
+
- `fields` from the declared Mongoid fields (referenced `belongs_to` foreign keys such as `shop_id`, and the `created_at` / `updated_at` columns added by `Mongoid::Timestamps`, are ordinary fields — their BSON `ObjectId` / `Date` values serialize as MongoDB Extended JSON at dump time). For an aliased field (`field :ctry, as: :country`), the generator emits the **stored** document key (`ctry`), never the Ruby accessor (`country`), so masking and projection target the key that actually appears in the document, and additionally records the accessor as `mongoid_field_name` on that field so the short key stays understandable (association aliases such as `shop => shop_id` and the built-in `id => _id` are not field renames and are not annotated),
|
|
163
|
+
- `belongs_tos` from referenced `belongs_to` associations (`{ table_name, foreign_key }`). A referenced `belongs_to` declared on an *embedded* document is dropped (cross-collection refs from inside embedded subdocuments are unsupported — see [MongoDB notes](#mongodb-notes)), but its foreign-key column is still kept as an ordinary field. A `has_and_belongs_to_many` association is also dropped (its foreign keys are stored as an array field, e.g. `tag_ids`, which exwiw cannot follow as a single-valued foreign key), while that `*_ids` array column is kept as an ordinary field,
|
|
164
|
+
- `embedded_in` from `embedded_in` / `embeds_many` / `embeds_one` associations. Each embedded config names its *immediate* parent collection and the document key it lives under (`store_as`, defaulting to the relation name); nested embedding is represented as a chain (`comments` → `embedded_in` `posts`, `posts` → `embedded_in` `users`) rather than a flattened dot-path, matching how the adapter recurses through array and Hash subdocuments. A *polymorphic* `embedded_in` (`embedded_in :addressable, polymorphic: true`) has no single embedding parent collection and so cannot be expressed as an `embedded_in` config; the generator raises a clear error pointing you to define that collection's config by hand. A *self-referential / cyclic* embedding (Mongoid's `recursively_embeds_many` / `recursively_embeds_one`) makes a collection both a top-level document and embedded inside documents of its own type; exwiw represents a collection as either top-level or embedded, not both, so the generator likewise raises a clear error rather than emit a config that would silently make the collection undumpable.
|
|
165
|
+
|
|
166
|
+
Models in an inheritance hierarchy whose subclasses share the base's collection (Mongoid STI, distinguished by the auto-added `_type` discriminator) collapse into a single config: the generator discovers the subclasses via `descendants` (Mongoid registers only the base class in `Mongoid.models`) and unions every class's `fields` and `belongs_tos` into the collection config, so subclass-only fields and associations are not lost.
|
|
167
|
+
|
|
168
|
+
Regeneration preserves hand-edited `replace_with`, `filter`, `skip`, and `bulk_insert_chunk_size` values, like the ActiveRecord generator. Indexes are not written to the config — they are introspected from the live database at dump time (see [MongoDB notes](#mongodb-notes)). Polymorphic `belongs_to` is not yet expanded by this task.
|
|
169
|
+
|
|
151
170
|
### Configuration
|
|
152
171
|
|
|
153
172
|
This is an example of the one table schema:
|
|
@@ -391,6 +410,9 @@ The MongoDB adapter is experimental. To use it:
|
|
|
391
410
|
- Add `gem "mongo"` to your Gemfile in addition to `exwiw` (it is not declared as a runtime dependency of the gem).
|
|
392
411
|
- Set `--adapter=mongodb`. `--user` / `DATABASE_PASSWORD` are optional and only needed when your MongoDB requires authentication.
|
|
393
412
|
- The MongoDB adapter consumes a separate config type, `MongodbCollectionConfig`, with MongoDB-native naming. Use `fields` (instead of the SQL adapters' `columns`), and set `"primary_key": "_id"`. Foreign keys (`shop_id`, `user_id`, ...) stay as ordinary fields.
|
|
413
|
+
- `--ids` values are coerced to the type actually stored in `_id` before filtering: integer-looking ids become `Integer`, 24-char hex ids become `BSON::ObjectId` (Mongoid's default `_id` type — a plain String would never match an ObjectId), and any other string is left as-is.
|
|
414
|
+
- `--target-collection=COLLECTION` is a mongodb-only alias of `--target-table` (use whichever reads better for MongoDB). Specifying both, or using `--target-collection` with a non-mongodb adapter, is an error.
|
|
415
|
+
- `--ids-field=FIELD` matches `--ids` against `FIELD` on the target collection instead of its primary key (e.g. `--target-collection=users --ids=a@example.com --ids-field=email`). Downstream foreign-key propagation still keys off the primary key, so only the target collection's filter changes. Unlike the primary-key path, the supplied ids are **not** type-coerced (the stored type of a custom field is unknown), so pass values matching the field's actual type. This flag is currently **mongodb-only** (the SQL adapters reject it; supporting them is a TODO).
|
|
394
416
|
- Output is JSON Lines (`insert-{idx}-{collection}.jsonl`) using MongoDB Extended JSON (relaxed mode). Import with `mongoimport`:
|
|
395
417
|
```bash
|
|
396
418
|
mongoimport --db app_dev --collection users --file dump/insert-002-users.jsonl
|
data/lib/exwiw/adapter/mongodb_adapter.rb
CHANGED
|
@@ -47,7 +47,22 @@ module Exwiw
|
|
|
47
47
|
|
|
48
48
|
filter =
|
|
49
49
|
if config.name == dump_target.table_name
|
|
50
|
-
|
|
50
|
+
# `--ids-field` may override which field --ids is matched against;
|
|
51
|
+
# otherwise fall back to the primary key. Note this only changes the
|
|
52
|
+
# WHERE filter on the target collection — downstream foreign-key
|
|
53
|
+
# propagation still keys off `primary_key` (see #execute, which
|
|
54
|
+
# stashes doc[primary_key] into @state).
|
|
55
|
+
#
|
|
56
|
+
# Type coercion is only applied to the primary key (`_id`), whose
|
|
57
|
+
# stored type we know (Mongoid's default ObjectId). For a custom
|
|
58
|
+
# `ids_field` the stored type is unknown, so the textual --ids are
|
|
59
|
+
# left as Strings rather than guessed at — the caller passes values
|
|
60
|
+
# matching the field's actual type.
|
|
61
|
+
if dump_target.ids_field
|
|
62
|
+
{ dump_target.ids_field => { "$in" => dump_target.ids } }
|
|
63
|
+
else
|
|
64
|
+
{ config.primary_key => { "$in" => coerce_ids(dump_target.ids) } }
|
|
65
|
+
end
|
|
51
66
|
else
|
|
52
67
|
constrained = config.belongs_tos.select do |relation|
|
|
53
68
|
@state.key?(relation.table_name) && !@state[relation.table_name].empty?
|
|
@@ -155,18 +170,42 @@ module Exwiw
|
|
|
155
170
|
end
|
|
156
171
|
|
|
157
172
|
# `--ids` from the CLI arrives as Strings. Mongo compares types strictly,
|
|
158
|
-
# so
|
|
159
|
-
#
|
|
173
|
+
# so the textual ids must be coerced to the type actually stored in `_id`:
|
|
174
|
+
#
|
|
175
|
+
# - integer-looking ids -> Integer
|
|
176
|
+
# - 24-char hex ids -> BSON::ObjectId (Mongoid's default `_id` type; a
|
|
177
|
+
# plain String would never match an ObjectId in a `$in` filter)
|
|
178
|
+
# - anything else (e.g. a String/UUID `_id`) is left as-is
|
|
179
|
+
#
|
|
180
|
+
# Only used for the primary-key filter; a custom `--ids-field` skips this
|
|
181
|
+
# because its stored type is unknown (see build_query).
|
|
160
182
|
private def coerce_ids(ids)
|
|
161
|
-
Array(ids).map
|
|
162
|
-
|
|
163
|
-
|
|
164
|
-
|
|
165
|
-
|
|
166
|
-
|
|
183
|
+
Array(ids).map { |id| coerce_id(id) }
|
|
184
|
+
end
|
|
185
|
+
|
|
186
|
+
private def coerce_id(id)
|
|
187
|
+
return id unless id.is_a?(String)
|
|
188
|
+
|
|
189
|
+
if id.match?(/\A-?\d+\z/)
|
|
190
|
+
id.to_i
|
|
191
|
+
elsif object_id_hex?(id)
|
|
192
|
+
BSON::ObjectId.from_string(id)
|
|
193
|
+
else
|
|
194
|
+
id
|
|
167
195
|
end
|
|
168
196
|
end
|
|
169
197
|
|
|
198
|
+
# True when `str` is a canonical 24-char hex ObjectId. `bson` ships with
|
|
199
|
+
# `mongo`/`mongoid` but may not be loaded yet when build_query runs before
|
|
200
|
+
# any db access, so require it lazily; if it is genuinely unavailable we
|
|
201
|
+
# fall back to leaving the id as a String.
|
|
202
|
+
private def object_id_hex?(str)
|
|
203
|
+
require 'bson' unless defined?(::BSON::ObjectId)
|
|
204
|
+
::BSON::ObjectId.legal?(str)
|
|
205
|
+
rescue LoadError
|
|
206
|
+
false
|
|
207
|
+
end
|
|
208
|
+
|
|
170
209
|
private def reject_filter!(config)
|
|
171
210
|
return if config.filter.nil? || config.filter.to_s.empty?
|
|
172
211
|
|
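The coercion order added to the MongoDB adapter above can be tried without the `bson` gem using the sketch below. `ObjectIdStub` is a hypothetical stand-in for `BSON::ObjectId`, and the 24-character hex regex is an assumption that approximates `BSON::ObjectId.legal?`. The branch order is copied from the diff.

```ruby
# Hypothetical stand-in for BSON::ObjectId so the sketch runs without the bson gem.
ObjectIdStub = Struct.new(:hex)

# Assumption: approximates BSON::ObjectId.legal? (24 hex characters).
OBJECT_ID_HEX = /\A[0-9a-fA-F]{24}\z/

def coerce_id(id)
  return id unless id.is_a?(String)

  if id.match?(/\A-?\d+\z/)
    id.to_i                # integer-looking ids become Integer
  elsif id.match?(OBJECT_ID_HEX)
    ObjectIdStub.new(id)   # 24-char hex ids become an ObjectId (stubbed here)
  else
    id                     # anything else (e.g. a UUID or email) stays a String
  end
end

def coerce_ids(ids)
  Array(ids).map { |id| coerce_id(id) }
end

p coerce_ids(["42", "507f1f77bcf86cd799439011", "a@example.com"])
```

Because the integer check runs first, a 24-character id made up only of decimal digits would take the `Integer` branch in this sketch; if the adapter behaves the same way, such an id would not match an `ObjectId` `_id`.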
data/lib/exwiw/cli.rb
CHANGED
|
@@ -37,7 +37,9 @@ module Exwiw
|
|
|
37
37
|
@database_adapter = nil
|
|
38
38
|
@database_name = nil
|
|
39
39
|
@target_table_name = nil
|
|
40
|
+
@target_collection_name = nil
|
|
40
41
|
@ids = []
|
|
42
|
+
@ids_field = nil
|
|
41
43
|
@output_format = nil
|
|
42
44
|
@insert_only = nil
|
|
43
45
|
@after_insert_hook_path = nil
|
|
@@ -66,6 +68,7 @@ module Exwiw
|
|
|
66
68
|
dump_target = DumpTarget.new(
|
|
67
69
|
table_name: @target_table_name,
|
|
68
70
|
ids: @ids,
|
|
71
|
+
ids_field: @ids_field,
|
|
69
72
|
)
|
|
70
73
|
|
|
71
74
|
logger = build_logger
|
|
@@ -95,6 +98,8 @@ module Exwiw
|
|
|
95
98
|
end
|
|
96
99
|
|
|
97
100
|
private def validate_options!
|
|
101
|
+
resolve_target_collection_alias!
|
|
102
|
+
|
|
98
103
|
if @subcommand == "explain"
|
|
99
104
|
validate_explain_only!
|
|
100
105
|
end
|
|
@@ -167,6 +172,23 @@ module Exwiw
|
|
|
167
172
|
exit 1
|
|
168
173
|
end
|
|
169
174
|
|
|
175
|
+
if @ids_field
|
|
176
|
+
# --ids-field overrides the field --ids filters against on the target
|
|
177
|
+
# table; it is meaningless without a target table to constrain.
|
|
178
|
+
if !@target_table_name
|
|
179
|
+
$stderr.puts "--target-table is required when --ids-field is specified"
|
|
180
|
+
exit 1
|
|
181
|
+
end
|
|
182
|
+
|
|
183
|
+
# TODO: support --ids-field for the sql adapters (mysql2/postgresql/
|
|
184
|
+
# sqlite3) by threading dump_target.ids_field through QueryAstBuilder's
|
|
185
|
+
# WHERE clause on the target table. For now it is mongodb-only.
|
|
186
|
+
if @database_adapter != "mongodb"
|
|
187
|
+
$stderr.puts "--ids-field is currently only supported by the mongodb adapter"
|
|
188
|
+
exit 1
|
|
189
|
+
end
|
|
190
|
+
end
|
|
191
|
+
|
|
170
192
|
if @after_insert_hook_path
|
|
171
193
|
unless File.file?(@after_insert_hook_path)
|
|
172
194
|
$stderr.puts "--after-insert-hook file not found: #{@after_insert_hook_path}"
|
|
@@ -181,6 +203,26 @@ module Exwiw
|
|
|
181
203
|
end
|
|
182
204
|
end
|
|
183
205
|
|
|
206
|
+
# `--target-collection` is a mongodb-only alias of `--target-table`. Fold it
|
|
207
|
+
# into @target_table_name (the single field the rest of the CLI/runner uses)
|
|
208
|
+
# after rejecting the misuses: combining it with --target-table, or using it
|
|
209
|
+
# with a non-mongodb adapter.
|
|
210
|
+
private def resolve_target_collection_alias!
|
|
211
|
+
return if @target_collection_name.nil?
|
|
212
|
+
|
|
213
|
+
if @target_table_name
|
|
214
|
+
$stderr.puts "Specify only one of --target-table and --target-collection"
|
|
215
|
+
exit 1
|
|
216
|
+
end
|
|
217
|
+
|
|
218
|
+
if @database_adapter != "mongodb"
|
|
219
|
+
$stderr.puts "--target-collection is only supported by the mongodb adapter (use --target-table)"
|
|
220
|
+
exit 1
|
|
221
|
+
end
|
|
222
|
+
|
|
223
|
+
@target_table_name = @target_collection_name
|
|
224
|
+
end
|
|
225
|
+
|
|
184
226
|
private def validate_explain_only!
|
|
185
227
|
if @database_adapter == "mongodb"
|
|
186
228
|
$stderr.puts "mongodb adapter is not yet supported by 'explain' subcommand"
|
|
@@ -211,6 +253,7 @@ module Exwiw
|
|
|
211
253
|
database_name: @database_name,
|
|
212
254
|
target_table: @target_table_name,
|
|
213
255
|
ids: @ids.dup.freeze,
|
|
256
|
+
ids_field: @ids_field,
|
|
214
257
|
output_format: @output_format,
|
|
215
258
|
insert_only: @insert_only,
|
|
216
259
|
log_level: @log_level,
|
|
@@ -261,7 +304,9 @@ module Exwiw
|
|
|
261
304
|
opts.on("-a", "--adapter=ADAPTER", "Database adapter") { |v| @database_adapter = v }
|
|
262
305
|
opts.on("--database=DATABASE", "Target database name") { |v| @database_name = v }
|
|
263
306
|
opts.on("--target-table=[TABLE]", "Target table for extraction. If omitted, dump all tables.") { |v| @target_table_name = v }
|
|
307
|
+
opts.on("--target-collection=[COLLECTION]", "Alias of --target-table for the mongodb adapter.") { |v| @target_collection_name = v }
|
|
264
308
|
opts.on("--ids=[IDS]", "Comma-separated list of identifiers. Required when --target-table is given.") { |v| @ids = v.split(',') }
|
|
309
|
+
opts.on("--ids-field=[FIELD]", "Field on the target table that --ids is matched against. Defaults to the primary key. (mongodb adapter only)") { |v| @ids_field = v }
|
|
265
310
|
opts.on("--output-format=[FORMAT]", "Output format: insert (default) or copy (PostgreSQL only, dump subcommand only)") { |v| @output_format = v }
|
|
266
311
|
opts.on("--insert-only", "Do not generate DELETE SQL files (dump subcommand only)") { @insert_only = true }
|
|
267
312
|
opts.on("--after-insert-hook=PATH", "Path to a .rb or .sh post-processing hook executed after all insert/delete files are written (dump subcommand only)") do |v|
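The option checks added to the CLI above can be summarised as a small pure function. This is a sketch under stated assumptions: `resolve_target` is a hypothetical name, it returns an error string where the real code writes to `$stderr` and calls `exit 1`, and it ignores any other validations that run between the two checks.

```ruby
# Hypothetical pure-function version of the alias and --ids-field checks.
# Returns [resolved_target, error_message]; exactly one of the two is nil
# unless no target was given at all.
def resolve_target(adapter:, target_table: nil, target_collection: nil, ids_field: nil)
  if target_collection
    if target_table
      return [nil, "Specify only one of --target-table and --target-collection"]
    end
    if adapter != "mongodb"
      return [nil, "--target-collection is only supported by the mongodb adapter (use --target-table)"]
    end
    # Fold the alias into the single target name used downstream.
    target_table = target_collection
  end

  if ids_field
    unless target_table
      return [nil, "--target-table is required when --ids-field is specified"]
    end
    if adapter != "mongodb"
      return [nil, "--ids-field is currently only supported by the mongodb adapter"]
    end
  end

  [target_table, nil]
end

p resolve_target(adapter: "mongodb", target_collection: "users", ids_field: "email")
p resolve_target(adapter: "mysql2", target_table: "users", ids_field: "email")
```

Resolving the alias first means `--target-collection` alone satisfies the `--ids-field` requirement for a target.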
|
data/lib/exwiw/mongodb_collection_config.rb
CHANGED
|
@@ -35,6 +35,39 @@ module Exwiw
|
|
|
35
35
|
!embedded_in.nil?
|
|
36
36
|
end
|
|
37
37
|
|
|
38
|
+
# Merge an auto-generated config (`passed`) into this user-maintained one so
|
|
39
|
+
# that `MongoidSchemaGenerator` regenerations preserve hand-edited values.
|
|
40
|
+
#
|
|
41
|
+
# - structural facts come from the freshly generated config: primary_key,
|
|
42
|
+
# belongs_tos, embedded_in.
|
|
43
|
+
# - user customizations are kept from the receiver: filter, skip,
|
|
44
|
+
# bulk_insert_chunk_size, and each field's `replace_with` masking rule.
|
|
45
|
+
# - generated fields drive the field list (so added/removed fields track the
|
|
46
|
+
# model), but a matching receiver field wins to retain its masking.
|
|
47
|
+
def merge(passed)
|
|
48
|
+
return passed if passed.to_hash == to_hash
|
|
49
|
+
|
|
50
|
+
MongodbCollectionConfig.new.tap do |merged|
|
|
51
|
+
merged.name = name
|
|
52
|
+
merged.primary_key = passed.primary_key
|
|
53
|
+
merged.filter = filter
|
|
54
|
+
merged.belongs_tos = passed.belongs_tos
|
|
55
|
+
merged.bulk_insert_chunk_size = bulk_insert_chunk_size
|
|
56
|
+
merged.skip = skip
|
|
57
|
+
merged.embedded_in = passed.embedded_in
|
|
58
|
+
|
|
59
|
+
# Take each field from the freshly generated config (so structural facts
|
|
60
|
+
# like `mongoid_field_name` track the model) but carry over the user's
|
|
61
|
+
# hand-edited `replace_with` masking when the field still exists.
|
|
62
|
+
receiver_field_by_name = fields.each_with_object({}) { |f, h| h[f.name] = f }
|
|
63
|
+
merged.fields = passed.fields.map do |pf|
|
|
64
|
+
receiver = receiver_field_by_name[pf.name]
|
|
65
|
+
pf.replace_with = receiver.replace_with if receiver&.replace_with
|
|
66
|
+
pf
|
|
67
|
+
end
|
|
68
|
+
end
|
|
69
|
+
end
|
|
70
|
+
|
|
38
71
|
private def validate_embedded!
|
|
39
72
|
return unless embedded?
|
|
40
73
|
return if belongs_tos.empty?
|
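The merge rules documented on `MongodbCollectionConfig#merge` above can be illustrated with plain Hashes. This is a sketch, not the gem's code: `merge_config` is a hypothetical name, the real method operates on config and field objects, and the `"REDACTED"` value is a placeholder rather than real `replace_with` syntax.

```ruby
# Plain-Hash sketch: `existing` is the user-maintained config on disk,
# `generated` is the freshly introspected one.
def merge_config(existing, generated)
  kept_masks = existing[:fields].to_h { |field| [field[:name], field[:replace_with]] }

  {
    name: existing[:name],
    # Structural facts follow the regenerated config.
    primary_key: generated[:primary_key],
    belongs_tos: generated[:belongs_tos],
    embedded_in: generated[:embedded_in],
    # User customizations are kept from the existing config.
    filter: existing[:filter],
    skip: existing[:skip],
    bulk_insert_chunk_size: existing[:bulk_insert_chunk_size],
    # The generated field list drives; a surviving field keeps its mask.
    fields: generated[:fields].map do |field|
      mask = kept_masks[field[:name]]
      mask ? field.merge(replace_with: mask) : field
    end,
  }
end

existing = {
  name: "users", primary_key: "_id", filter: nil, skip: nil,
  bulk_insert_chunk_size: 500, belongs_tos: [], embedded_in: nil,
  fields: [{ name: "_id" }, { name: "email", replace_with: "REDACTED" }, { name: "legacy" }],
}
generated = {
  name: "users", primary_key: "_id", embedded_in: nil,
  belongs_tos: [{ table_name: "shops", foreign_key: "shop_id" }],
  fields: [{ name: "_id" }, { name: "email" }, { name: "shop_id" }],
}

merged = merge_config(existing, generated)
p merged[:fields]
```

In this example the removed `legacy` field drops out, the new `shop_id` field appears, and `email` keeps its hand-edited mask.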
data/lib/exwiw/mongodb_field.rb
CHANGED
|
@@ -6,6 +6,11 @@ module Exwiw
|
|
|
6
6
|
|
|
7
7
|
attribute :name, String
|
|
8
8
|
attribute :replace_with, optional(String), skip_serializing_if_nil: true
|
|
9
|
+
# The Mongoid model's Ruby accessor when the stored document key (`name`)
|
|
10
|
+
# was renamed via `field :ctry, as: :country`. Purely informational — exwiw
|
|
11
|
+
# masks/projects by `name` (the storage key) — but surfacing the accessor
|
|
12
|
+
# keeps an otherwise cryptic short key understandable in the config.
|
|
13
|
+
attribute :mongoid_field_name, optional(String), skip_serializing_if_nil: true
|
|
9
14
|
|
|
10
15
|
def self.from_symbol_keys(hash)
|
|
11
16
|
from(hash.transform_keys(&:to_s))
|
data/lib/exwiw/mongoid_schema_generator.rb
ADDED
|
@@ -0,0 +1,240 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
require "fileutils"
|
|
4
|
+
require "json"
|
|
5
|
+
|
|
6
|
+
module Exwiw
|
|
7
|
+
# Generates exwiw `MongodbCollectionConfig` files by introspecting Mongoid
|
|
8
|
+
# document models. This is the MongoDB/Mongoid counterpart of
|
|
9
|
+
# `SchemaGenerator` (which targets ActiveRecord); it is intentionally a
|
|
10
|
+
# separate class and rake task because the two ORMs expose entirely
|
|
11
|
+
# different metadata APIs.
|
|
12
|
+
#
|
|
13
|
+
# Introspection relies only on class-level Mongoid metadata
|
|
14
|
+
# (`fields`, `relations`, `collection_name`), so it does not require a live
|
|
15
|
+
# MongoDB connection.
|
|
16
|
+
class MongoidSchemaGenerator
|
|
17
|
+
def self.from_rails_application(output_dir:)
|
|
18
|
+
Rails.application.eager_load!
|
|
19
|
+
new(models: ::Mongoid.models, output_dir: output_dir)
|
|
20
|
+
end
|
|
21
|
+
|
|
22
|
+
def initialize(models:, output_dir:)
|
|
23
|
+
@models = models
|
|
24
|
+
@output_dir = output_dir
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
def generate!
|
|
28
|
+
collections = build_collections
|
|
29
|
+
write_files(@output_dir, collections)
|
|
30
|
+
collections
|
|
31
|
+
end
|
|
32
|
+
|
|
33
|
+
# Returns an array of `MongodbCollectionConfig`, one per *collection*
|
|
34
|
+
# (top-level collections and embedded subdocument configs alike).
|
|
35
|
+
#
|
|
36
|
+
# Models are grouped by `collection_name` so an inheritance hierarchy whose
|
|
37
|
+
# subclasses share the base's collection (Mongoid STI, discriminated by the
|
|
38
|
+
# auto-added `_type` field) collapses into a single config that aggregates
|
|
39
|
+
# every class's fields and associations. See `expand_with_descendants`.
|
|
40
|
+
def build_collections
|
|
41
|
+
models = expand_with_descendants(concrete_models)
|
|
42
|
+
models
|
|
43
|
+
.group_by { |model| model.collection_name.to_s }
|
|
44
|
+
.map { |collection_name, group| build_collection_for(collection_name, group) }
|
|
45
|
+
end
|
|
46
|
+
|
|
47
|
+
def write_files(dir, collections)
|
|
48
|
+
FileUtils.mkdir_p(dir)
|
|
49
|
+
|
|
50
|
+
collections.each do |collection|
|
|
51
|
+
path = File.join(dir, "#{collection.name}.json")
|
|
52
|
+
config_to_write =
|
|
53
|
+
if File.exist?(path)
|
|
54
|
+
MongodbCollectionConfig.from(JSON.parse(File.read(path))).merge(collection)
|
|
55
|
+
else
|
|
56
|
+
collection
|
|
57
|
+
end
|
|
58
|
+
File.write(path, JSON.pretty_generate(config_to_write.to_hash) + "\n")
|
|
59
|
+
end
|
|
60
|
+
end
|
|
61
|
+
|
|
62
|
+
# Builds one config for the collection shared by `models` (usually a single
|
|
63
|
+
# model, but an inheritance hierarchy contributes several). Fields and
|
|
64
|
+
# belongs_tos are unioned across the group; processing least-derived first
|
|
65
|
+
# keeps the base's fields leading the list and the output deterministic
|
|
66
|
+
# regardless of input order or sibling subclasses.
|
|
67
|
+
private def build_collection_for(collection_name, models)
|
|
68
|
+
ordered = models.sort_by { |model| [model.fields.size, model.name] }
|
|
69
|
+
|
|
70
|
+
attrs = {
|
|
71
|
+
name: collection_name,
|
|
72
|
+
primary_key: "_id",
|
|
73
|
+
fields: aggregate_fields(ordered),
|
|
74
|
+
}
|
|
75
|
+
|
|
76
|
+
if ordered.any?(&:embedded?)
|
|
77
|
+
# Cross-collection references from inside an embedded array are not
|
|
78
|
+
# supported (MongodbCollectionConfig rejects them), so embedded configs
|
|
79
|
+
# always carry an empty belongs_tos and instead declare where they live.
|
|
80
|
+
attrs[:belongs_tos] = []
|
|
81
|
+
attrs[:embedded_in] = embedded_in_for(ordered.find(&:embedded?))
|
|
82
|
+
else
|
|
83
|
+
attrs[:belongs_tos] = aggregate_belongs_tos(ordered)
|
|
84
|
+
end
|
|
85
|
+
|
|
86
|
+
MongodbCollectionConfig.from_symbol_keys(attrs)
|
|
87
|
+
end
|
|
88
|
+
|
|
89
|
+
# Mongoid registers only the base class of an inheritance hierarchy in
|
|
90
|
+
# `Mongoid.models`; subclasses that store into the base's collection
|
|
91
|
+
# (STI-style, distinguished by the auto-added `_type` discriminator) are
|
|
92
|
+
# reachable only via `descendants`, and the base's own metadata does NOT
|
|
93
|
+
# include subclass-only fields or associations. Expand the model set with
|
|
94
|
+
# descendants so each collection's config aggregates every class that
|
|
95
|
+
# stores into it. (A subclass that overrides `store_in` to a different
|
|
96
|
+
# collection naturally falls into its own group via the `collection_name`
|
|
97
|
+
# grouping in `build_collections`.)
|
|
98
|
+
private def expand_with_descendants(models)
|
|
99
|
+
concrete((models + models.flat_map(&:descendants)).uniq)
|
|
100
|
+
end
|
|
101
|
+
|
|
102
|
+
# Mongoid registers internal helper classes (e.g. the discriminator key
|
|
103
|
+
# host) in `Mongoid.models`; those have no usable `collection_name`. Keep
|
|
104
|
+
# only application documents.
|
|
105
|
+
private def concrete_models
|
|
106
|
+
concrete(@models)
|
|
107
|
+
end
|
|
108
|
+
|
|
109
|
+
private def concrete(models)
|
|
110
|
+
models.select do |model|
|
|
111
|
+
model.respond_to?(:collection_name) &&
|
|
112
|
+
model.name &&
|
|
113
|
+
!model.name.start_with?("Mongoid::")
|
|
114
|
+
end
|
|
115
|
+
end
|
|
116
|
+
|
|
117
|
+
# Unions the declared field names across `models`, preserving first-seen
|
|
118
|
+
# order. A subclass's `fields` already includes everything it inherits, so
|
|
119
|
+
# the base's fields lead and each subclass appends only its own.
|
|
120
|
+
private def aggregate_fields(models)
|
|
121
|
+
seen = {}
|
|
122
|
+
models.each_with_object([]) do |model, fields|
|
|
123
|
+
accessor_by_storage = aliased_field_accessors(model)
|
|
124
|
+
model.fields.keys.each do |name|
|
|
125
|
+
next if seen[name]
|
|
126
|
+
|
|
127
|
+
seen[name] = true
|
|
128
|
+
field = { name: name }
|
|
129
|
+
# When `field :ctry, as: :country` renamed the storage key, surface the
|
|
130
|
+
# Ruby accessor so the short key is not cryptic in the config.
|
|
131
|
+
accessor = accessor_by_storage[name]
|
|
132
|
+
field[:mongoid_field_name] = accessor if accessor
|
|
133
|
+
fields << field
|
|
134
|
+
end
|
|
135
|
+
end
|
|
136
|
+
end
|
|
137
|
+
|
|
138
|
+
# Maps a stored document key -> its Mongoid Ruby accessor, but ONLY for
|
|
139
|
+
# genuine `field ..., as:` renames. `Model.aliased_fields` also contains the
|
|
140
|
+
# built-in `id => _id` and one entry per association (e.g. `shop => shop_id`,
|
|
141
|
+
# `profile => user_profile`); those are not field renames, so exclude any
|
|
142
|
+
# accessor that names a relation, the `_id` storage key, or a no-op alias.
|
|
143
|
+
private def aliased_field_accessors(model)
|
|
144
|
+
relation_names = model.relations.keys
|
|
145
|
+
model.aliased_fields.each_with_object({}) do |(accessor, storage), acc|
|
|
146
|
+
next if accessor == storage
|
|
147
|
+
next if storage == "_id"
|
|
148
|
+
next if relation_names.include?(accessor)
|
|
149
|
+
next unless model.fields.key?(storage)
|
|
150
|
+
|
|
151
|
+
acc[storage] = accessor
|
|
152
|
+
end
|
|
153
|
+
end
|
|
154
|
+
|
|
155
|
+
private def aggregate_belongs_tos(models)
|
|
156
|
+
belongs_to_assocs = models.flat_map do |model|
|
|
157
|
+
model.relations.values.select do |assoc|
|
|
158
|
+
assoc.is_a?(::Mongoid::Association::Referenced::BelongsTo)
|
|
159
|
+
end
|
|
160
|
+
end
|
|
161
|
+
|
|
162
|
+
# A polymorphic belongs_to (`belongs_to :reviewable, polymorphic: true`) has no
|
|
163
|
+
# single target collection, so it is not supported yet. Exclude it here so that
|
|
164
|
+
# no incorrect FK is emitted (leaving room to expand it later, as the ActiveRecord version does).
|
|
165
|
+
#
|
|
166
|
+
# In an inheritance hierarchy the base class and its subclasses each carry the same belongs_to, so deduplicate with uniq.
|
|
167
|
+
belongs_to_assocs
|
|
168
|
+
.reject(&:polymorphic?)
|
|
169
|
+
.map { |assoc| { table_name: assoc.klass.collection_name.to_s, foreign_key: assoc.foreign_key } }
|
|
170
|
+
.uniq
|
|
171
|
+
end
|
|
172
|
+
|
|
173
|
+
# Resolves the `embedded_in` config for an embedded model. Each embedded
|
|
174
|
+
# model points at its *immediate* embedding parent: the parent's collection
|
|
175
|
+
# name plus the single document key (`store_as`, defaulting to the relation
|
|
176
|
+
# name) the subdocuments live under within that parent.
|
|
177
|
+
#
|
|
178
|
+
# Multi-level nesting is represented one link at a time, NOT flattened into
|
|
179
|
+
# a dot-separated path. For `User embeds_many :posts` and
|
|
180
|
+
# `Post embeds_many :comments`, the Post config resolves to
|
|
181
|
+
# `{ collection_name: "users", path: "posts" }` and the Comment config to
|
|
182
|
+
# `{ collection_name: "posts", path: "comments" }`. `MongodbAdapter` walks
|
|
183
|
+
# this chain recursively (masking each `posts` subdocument, then its
|
|
184
|
+
# `comments`), which is the only form that correctly traverses both array
|
|
185
|
+
# (`embeds_many`) and Hash (`embeds_one`) intermediates — a flattened
|
|
186
|
+
# `posts.comments` path would stop at the `posts` array boundary.
|
|
187
|
+
private def embedded_in_for(model)
|
|
188
|
+
assoc = embedded_in_association(model)
|
|
189
|
+
|
|
190
|
+
# A polymorphic `embedded_in` (`embedded_in :addressable, polymorphic: true`)
|
|
191
|
+
# can live inside several different parent collections, so it has no single
|
|
192
|
+
# embedding parent and `assoc.klass` would raise a cryptic NameError
|
|
193
|
+
# (uninitialized constant) trying to resolve one. exwiw's `embedded_in`
|
|
194
|
+
# names exactly one parent collection + path, so this shape cannot be
|
|
195
|
+
# represented; fail loudly with an actionable message instead of crashing.
|
|
196
|
+
if assoc.polymorphic?
|
|
197
|
+
raise ArgumentError,
|
|
198
|
+
"MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
|
|
199
|
+
"declares a polymorphic `embedded_in :#{assoc.name}`, which has no single embedding " \
|
|
200
|
+
"parent collection and cannot be expressed as an exwiw `embedded_in` config. " \
|
|
201
|
+
"Define the collection's config by hand, or make the relation non-polymorphic."
|
|
202
|
+
end
|
|
203
|
+
|
|
204
|
+
parent = assoc.klass
|
|
205
|
+
|
|
206
|
+
# A self-referential / cyclic `embedded_in` — Mongoid's
|
|
207
|
+
# `recursively_embeds_many` / `recursively_embeds_one` (which declare a
|
|
208
|
+
# `cyclic: true` `embedded_in`/`embeds_*` pair pointing at the same model),
|
|
209
|
+
# or any hand-rolled self-embedding — makes a collection BOTH a top-level
|
|
210
|
+
# document AND embedded inside documents of its own type. exwiw represents
|
|
211
|
+
# a collection as either top-level (dumpable on its own) or embedded
|
|
212
|
+
# (masked through its parent at `path`), never both: emitting an
|
|
213
|
+
# `embedded_in` here would mark the whole collection embedded, so
|
|
214
|
+
# `MongodbAdapter#dumpable?` (`!embedded?`) would silently never dump the
|
|
215
|
+
# collection's root documents. Fail loudly instead.
|
|
216
|
+
if parent.collection_name.to_s == model.collection_name.to_s
|
|
217
|
+
raise ArgumentError,
|
|
218
|
+
"MongoidSchemaGenerator: '#{model.name}' (collection '#{model.collection_name}') " \
|
|
219
|
+
"declares a self-referential (cyclic) `embedded_in :#{assoc.name}` that embeds the " \
|
|
220
|
+
"collection inside documents of its own type (e.g. `recursively_embeds_many` / " \
|
|
221
|
+
"`recursively_embeds_one`). " \
|
|
222
|
+
"exwiw represents a collection as either top-level or embedded, not both, so this " \
|
|
223
|
+
"cannot be expressed as an exwiw `embedded_in` config. Define the collection's config " \
|
|
224
|
+
"by hand."
|
|
225
|
+
end
|
|
226
|
+
|
|
227
|
+
# `store_as` defaults to the relation name and is the actual document key
|
|
228
|
+
# the subdocuments are stored under inside the immediate parent.
|
|
229
|
+
parent_relation = parent.relations[assoc.inverse.to_s]
|
|
230
|
+
|
|
231
|
+
{ collection_name: parent.collection_name.to_s, path: parent_relation.store_as }
|
|
232
|
+
end
|
|
233
|
+
|
|
234
|
+
private def embedded_in_association(model)
|
|
235
|
+
model.relations.values.find do |assoc|
|
|
236
|
+
assoc.is_a?(::Mongoid::Association::Embedded::EmbeddedIn)
|
|
237
|
+
end
|
|
238
|
+
end
|
|
239
|
+
end
|
|
240
|
+
end
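
The one-link-at-a-time `embedded_in` representation described in the generator's comments can be illustrated in plain Ruby. This is a minimal sketch, not exwiw's actual `MongodbAdapter` code: the `CONFIGS` table, the `children_of` and `mask_embedded` helpers, the `masked` key, and the masking rule (overwriting a field with `"***"`) are all hypothetical and exist only to show why following one link per level handles both array (`embeds_many`) and Hash (`embeds_one`) intermediates.

```ruby
# Hypothetical per-collection configs (assumption, not exwiw's real format).
# Each embedded collection names only its immediate parent plus a single key.
CONFIGS = {
  "posts"    => { embedded_in: { collection_name: "users", path: "posts" },    masked: ["title"] },
  "comments" => { embedded_in: { collection_name: "posts", path: "comments" }, masked: ["body"] },
  "profiles" => { embedded_in: { collection_name: "users", path: "profile" },  masked: ["bio"] },
}.freeze

# Collections embedded directly inside `parent_name`.
def children_of(parent_name)
  CONFIGS.select { |_name, cfg| cfg[:embedded_in][:collection_name] == parent_name }
end

# Returns a masked copy of `doc`, descending one embedding link at a time.
# Arrays (embeds_many) are mapped element by element; Hashes (embeds_one)
# are handled directly, so nesting depth does not matter.
def mask_embedded(doc, collection_name)
  result = doc.dup
  children_of(collection_name).each do |child_name, cfg|
    path = cfg[:embedded_in][:path]
    value = result[path]
    next if value.nil?

    mask_one = lambda do |sub|
      masked_sub = mask_embedded(sub, child_name) # recurse into deeper links first
      cfg[:masked].each { |field| masked_sub[field] = "***" if masked_sub.key?(field) }
      masked_sub
    end

    result[path] = value.is_a?(Array) ? value.map { |sub| mask_one.call(sub) } : mask_one.call(value)
  end
  result
end

user = {
  "name"    => "alice",
  "profile" => { "bio" => "private" },
  "posts"   => [
    { "title" => "hello", "comments" => [{ "body" => "secret" }] },
  ],
}

masked = mask_embedded(user, "users")
```

Because each level is resolved against its own parent, the `comments` array inside every element of `posts` is reached without ever building a flattened `posts.comments` path.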
|
|
@@ -107,6 +107,10 @@ module Exwiw
|
|
|
107
107
|
clauses = []
|
|
108
108
|
|
|
109
109
|
if table.name == dump_target.table_name
|
|
110
|
+
# TODO: honor dump_target.ids_field here so `--ids` can match a non
|
|
111
|
+
# primary-key column on the target table (currently mongodb-only; the
|
|
112
|
+
# CLI rejects --ids-field for the sql adapters). When implemented, use
|
|
113
|
+
# `dump_target.ids_field || table.primary_key` as the column_name.
|
|
110
114
|
clauses.push Exwiw::QueryAst::WhereClause.new(
|
|
111
115
|
column_name: table.primary_key,
|
|
112
116
|
operator: :eq,
|
data/lib/exwiw/version.rb
CHANGED
data/lib/exwiw.rb
CHANGED
|
@@ -25,6 +25,7 @@ require_relative "exwiw/after_insert_hook"
|
|
|
25
25
|
require_relative "exwiw/runner"
|
|
26
26
|
require_relative "exwiw/explain_runner"
|
|
27
27
|
require_relative "exwiw/schema_generator"
|
|
28
|
+
require_relative "exwiw/mongoid_schema_generator"
|
|
28
29
|
|
|
29
30
|
begin
|
|
30
31
|
require 'rails'
|
|
@@ -34,6 +35,9 @@ else
|
|
|
34
35
|
end
|
|
35
36
|
|
|
36
37
|
module Exwiw
|
|
37
|
-
|
|
38
|
+
# `ids_field` optionally overrides which field `--ids` is matched against on
|
|
39
|
+
# the target table. When nil the table's primary key is used (the historical
|
|
40
|
+
# behavior). Currently only honored by the mongodb adapter.
|
|
41
|
+
DumpTarget = Struct.new(:table_name, :ids, :ids_field, keyword_init: true)
|
|
38
42
|
ConnectionConfig = Struct.new(:adapter, :host, :port, :user, :password, :database_name, keyword_init: true)
|
|
39
43
|
end
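
The `ids_field` fallback described in the comments above can be sketched as follows. The `DumpTarget` struct mirrors the definition in this diff; the `filter_field` helper and the `"_id"` primary key passed to it are assumptions made for illustration and are not part of exwiw.

```ruby
# Same shape as the struct in this release; an omitted keyword defaults to nil.
DumpTarget = Struct.new(:table_name, :ids, :ids_field, keyword_init: true)

# Hypothetical helper: the field `--ids` is matched against on the target.
# Falls back to the primary key when no ids_field was supplied.
def filter_field(dump_target, primary_key)
  dump_target.ids_field || primary_key
end

by_pk    = DumpTarget.new(table_name: "users", ids: ["1"])
by_email = DumpTarget.new(table_name: "users", ids: ["a@example.com"], ids_field: "email")

pk_field    = filter_field(by_pk, "_id")
email_field = filter_field(by_email, "_id")
```

Existing callers that build a `DumpTarget` without `ids_field` keep the historical primary-key behavior, since the new member is simply `nil` for them.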
|
data/lib/tasks/exwiw.rake
CHANGED
|
@@ -10,5 +10,14 @@ namespace :exwiw do
|
|
|
10
10
|
output_dir: ENV["OUTPUT_DIR_PATH"] || "exwiw",
|
|
11
11
|
).generate!
|
|
12
12
|
end
|
|
13
|
+
|
|
14
|
+
desc "Generate schema from a Mongoid application"
|
|
15
|
+
task generate_mongoid: :environment do
|
|
16
|
+
require "exwiw"
|
|
17
|
+
|
|
18
|
+
Exwiw::MongoidSchemaGenerator.from_rails_application(
|
|
19
|
+
output_dir: ENV["OUTPUT_DIR_PATH"] || "exwiw",
|
|
20
|
+
).generate!
|
|
21
|
+
end
|
|
13
22
|
end
|
|
14
23
|
end
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: exwiw
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.2.
|
|
4
|
+
version: 0.2.9
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Shia
|
|
@@ -57,6 +57,7 @@ files:
|
|
|
57
57
|
- lib/exwiw/mongo_query.rb
|
|
58
58
|
- lib/exwiw/mongodb_collection_config.rb
|
|
59
59
|
- lib/exwiw/mongodb_field.rb
|
|
60
|
+
- lib/exwiw/mongoid_schema_generator.rb
|
|
60
61
|
- lib/exwiw/query_ast.rb
|
|
61
62
|
- lib/exwiw/query_ast_builder.rb
|
|
62
63
|
- lib/exwiw/railtie.rb
|