exwiw 0.8.1 → 0.8.3
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -0
- data/README.md +48 -2
- data/lib/exwiw/adapter/mysql_adapter.rb +8 -0
- data/lib/exwiw/adapter/postgresql_adapter.rb +33 -0
- data/lib/exwiw/adapter/sqlite_adapter.rb +8 -0
- data/lib/exwiw/query_ast.rb +20 -1
- data/lib/exwiw/query_ast_builder.rb +118 -25
- data/lib/exwiw/reverse_scope.rb +47 -0
- data/lib/exwiw/table_config.rb +14 -0
- data/lib/exwiw/version.rb +1 -1
- data/lib/exwiw.rb +1 -0
- metadata +2 -1
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 90d949cb54565ed644b599102cfabc16236fe2d83523594cd7b14f269118a9b2
|
|
4
|
+
data.tar.gz: 7236bb6ee6b9dda38aec93491f0266e334890e3c271b4f7d5d072d46a1e0f88b
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: bcfac0aaf220b55dfa3172f94d2ad0a6e828f53253e367aa9203db7061695fa6a9188e22a92c18d51bd9a55b6a9bf840cf780a3647d4aca2a0cccff1a785a347
|
|
7
|
+
data.tar.gz: d891fc7101fea30d674b933213715a8a4534c459e191b9bf4fce31b3bda8a9be037bf09627419f43ef914d2a9288aa40e1cd4836a4d8e1e0901d73e320218cd3
|
data/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,18 @@
|
|
|
2
2
|
|
|
3
3
|
## [Unreleased]
|
|
4
4
|
|
|
5
|
+
## [0.8.3] - 2026-06-24
|
|
6
|
+
|
|
7
|
+
### Fixed
|
|
8
|
+
|
|
9
|
+
- **Forward scope (`via_scoped_parent`) now cascades across multiple `belongs_to` hops instead of dying after one.** A table with no scope column of its own is scoped by constraining it to its `belongs_to` parent's in-scope ids (`fk IN (SELECT parent.pk FROM <parent's scoped query>)`). Previously the parent was rebuilt with forward scoping turned off, so if the parent was *itself* scoped only through *its* parent (e.g. an identity-family table two or more hops below a `reverse_scope`/`referenced_by` table — `users ← end_users ← end_user_profiles`), the rebuilt parent came back unconstrained and the child was classified `:unscopable` — forcing a `scope_exempt` full dump and re-introducing the bloat the prune removes. The boolean single-hop bound is replaced by a forward-path guard: the rescue keeps forward scoping enabled while rebuilding the parent (appending the current table to the path), so the cascade recurses N levels and produces a correspondingly nested `IN (subquery)`; it terminates only on a genuine `belongs_to` cycle (a table already on the path is not revisited, falling through to `:unscopable`). The single-unambiguous-parent rule and the polymorphic-skip are unchanged, and the reverse arms still cannot loop back through the table being reverse-scoped. SQL adapters only.
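The multi-hop cascade and its cycle guard described in this entry can be sketched as a small recursive function. This is a simplified model, not exwiw's implementation: the `BELONGS_TO` and `SCOPED` tables, the `in_scope_ids` helper, the `a`/`b` cycle, the foreign-key names, and the assumption that every primary key is `id` are all hypothetical and exist only for illustration.

```ruby
# Hypothetical, simplified model of the forward-path guard (not exwiw's API).
# BELONGS_TO maps a table to its single belongs_to parent and foreign key.
BELONGS_TO = {
  "end_user_profiles" => { parent: "end_users", fk: "end_user_id" },
  "end_users"         => { parent: "users",     fk: "id" },
  "a"                 => { parent: "b",         fk: "b_id" }, # a <-> b is a cycle
  "b"                 => { parent: "a",         fk: "a_id" },
}.freeze

# Tables that are already constrained, e.g. through reverse_scope.
SCOPED = {
  "users" => "SELECT users.id FROM users WHERE <users' scope>",
}.freeze

# Returns a subquery selecting the table's in-scope primary keys (assumed to
# be `id` everywhere), or nil when the table cannot be scoped.
def in_scope_ids(table, forward_path = [])
  return SCOPED[table] if SCOPED.key?(table)

  relation = BELONGS_TO[table]
  return nil if relation.nil?

  # Grow the path with the current table; a parent already on it would close
  # a belongs_to cycle, so stop instead of recursing forever.
  path = forward_path + [table]
  return nil if path.include?(relation[:parent])

  parent_ids = in_scope_ids(relation[:parent], path)
  return nil if parent_ids.nil?

  "SELECT #{table}.id FROM #{table} " \
    "WHERE #{table}.#{relation[:fk]} IN (#{parent_ids})"
end

puts in_scope_ids("end_user_profiles") # nested IN (subquery), two hops deep
p in_scope_ids("a")                    # nil: the a -> b -> a cycle is cut
```

Passing a new array (`forward_path + [table]`) rather than mutating a shared one keeps each branch of the recursion independent, which mirrors how the diff threads `@forward_path + [table.name]` into nested builds.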
|
|
10
|
+
|
|
11
|
+
## [0.8.2] - 2026-06-24
|
|
12
|
+
|
|
13
|
+
### Added
|
|
14
|
+
|
|
15
|
+
- **Multi-referencer reverse scoping (`reverse_scope`).** Reverse / "referenced_by" extraction previously narrowed only a table referenced by *exactly one* constrained child; a table referenced by two or more (most importantly a **global-identity table** like `users`, which carries no scope/tenant column and has no `belongs_to` of its own, yet many scoped tables point *at* it) fell back to dumping every row — dragging in every tenant's identities. A table can now opt into **multi-referencer** reverse scoping with a user-owned `reverse_scope: { via: [{ table, column }, …] }` key listing the referencers whose own (already scoped) extraction queries are `UNION`'d into the id set it is constrained to (`pk IN (SELECT ref1.col1 FROM ref1 <scope> UNION SELECT ref2.col2 FROM ref2 <scope> …)`). Each arm reuses that referencer's own scope, so a per-tenant run keeps only that tenant's ids; the named `column` is explicit, so a non-default foreign key (or a column with no declared `belongs_to`) is honored; NULLs are excluded per arm. Only **scoped** referencers belong in `via` — an unconstrained arm (e.g. a `scope_exempt` referencer) would union every row back, so it is skipped with a warning rather than silently widening the dump, and an unknown table is likewise skipped; if no arm survives, the table stays unscopable (so scope-column mode's `validate_scope!` still aborts rather than dumping it in full). Tables that `belongs_to` the reverse-scoped table tighten automatically through the existing cascade and need no config. `reverse_scope` is never emitted by `schema:generate` and is preserved across regeneration like `scope_exempt`/`scope_column`. SQL adapters only. See the [README](README.md#reverse-scope-for-multi-referencer-tables-reverse_scope).
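As a rough illustration of how the `UNION`'d id set in this entry is assembled, the sketch below composes the `pk IN (... UNION ...)` condition from a list of arms. It is a minimal sketch under stated assumptions: the `Arm` struct, the `reverse_scope_where` helper, the `on_skip` callback, and the scope strings (`customers.shop_id = 1`, `staff.shop_id = 1`) are hypothetical, and the real builder works on query ASTs rather than strings.

```ruby
# Hypothetical arm: a referencer table, the column pointing at the
# reverse-scoped table, and the referencer's own scope condition
# (nil when the referencer is not scoped).
Arm = Struct.new(:table, :column, :scope, keyword_init: true)

# Builds "<table>.<pk> IN (<arm 1> UNION <arm 2> ...)" or returns nil when no
# arm survives. Unscoped arms are skipped and reported through on_skip,
# because including them would union every row back in.
def reverse_scope_where(table, primary_key, arms, on_skip: ->(msg) { Kernel.warn(msg) })
  selects = arms.filter_map do |arm|
    if arm.scope.nil?
      on_skip.call("#{table}.reverse_scope arm '#{arm.table}.#{arm.column}' is not scoped; skipping it")
      next
    end

    "SELECT #{arm.table}.#{arm.column} FROM #{arm.table} " \
      "WHERE #{arm.scope} AND #{arm.table}.#{arm.column} IS NOT NULL"
  end

  return nil if selects.empty?

  "#{table}.#{primary_key} IN (#{selects.join(' UNION ')})"
end

arms = [
  Arm.new(table: "customers", column: "user_id", scope: "customers.shop_id = 1"),
  Arm.new(table: "staff",     column: "user_id", scope: "staff.shop_id = 1"),
]
puts reverse_scope_where("users", "id", arms)
```

Returning `nil` for an empty arm list, instead of an always-true or always-false condition, leaves the decision to the caller, which matches the entry's statement that the table then stays unscopable.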
|
|
16
|
+
|
|
5
17
|
## [0.8.1] - 2026-06-24
|
|
6
18
|
|
|
7
19
|
### Fixed
|
data/README.md
CHANGED
|
@@ -183,8 +183,11 @@ Each table is resolved as follows:
|
|
|
183
183
|
(`fk IN (SELECT parent.pk FROM <parent's scoped query>)`). This covers a *hub*
|
|
184
184
|
table that has no scope column and is scoped only because an extractable child
|
|
185
185
|
references it (see referenced-by below): the hub's other `belongs_to` children
|
|
186
|
-
ride along to just the in-scope rows instead of being dumped in full.
|
|
187
|
-
|
|
186
|
+
ride along to just the in-scope rows instead of being dumped in full. The parent
|
|
187
|
+
itself may be scoped the same way, so this **cascades across multiple hops**
|
|
188
|
+
(each a single unambiguous scopable parent) and the subquery nests
|
|
189
|
+
correspondingly; the recursion terminates on a genuine `belongs_to` cycle (a
|
|
190
|
+
table already on the path is left `:unscopable` rather than looped on).
|
|
188
191
|
- **Cannot be scoped at all** (no scope column and no path to one) → exwiw
|
|
189
192
|
**aborts** and lists the offending tables, so an unscoped table is never silently
|
|
190
193
|
dumped in full. For each, either declare a `scope_column`, add a `belongs_to`
|
|
@@ -579,6 +582,49 @@ ActiveStorage is handled automatically — no ActiveStorage-specific configurati
|
|
|
579
582
|
`active_storage_variant_records` also references blobs, but since it has no path of its own to the dump target it doesn't constrain anything and is ignored as a referencer — blobs stays narrowed to the attachment-referenced ids. (A parent referenced by *multiple* constrained children currently falls back to dumping all of its rows.)
|
|
580
583
|
- **`active_storage_variant_records`** holds derivative variant-tracking rows that ActiveStorage regenerates lazily, and it too has no path to the dump target — left alone it would land in the "no relation → dump all" branch and, worse, its `blob_id` could point at blobs outside the narrowed set above (a foreign-key violation on import). `exwiw:schema:generate` therefore emits it with **`ignore: true`** (and drops it from the attachments `record` polymorphic expansion so nothing carries a dangling reference to it), so its data is skipped while the DDL is still written. Remove `ignore` from the generated config if you really need to export it.
|
|
581
584
|
|
|
585
|
+
### Reverse scope for multi-referencer tables (`reverse_scope`)
|
|
586
|
+
|
|
587
|
+
The automatic reverse extraction above narrows a table referenced by **exactly one** constrained child. A table referenced by **two or more** constrained children falls back to dumping every row — fine for `active_storage_blobs`, but a problem for a **global-identity table** such as `users`: it carries no scope/tenant column and has no `belongs_to` of its own, yet dozens of scoped tables point *at* it. Dumping it (and everything that hangs off it) in full pulls in every tenant's identities.
|
|
588
|
+
|
|
589
|
+
`reverse_scope` opts such a table into **multi-referencer** reverse scoping: you enumerate the referencers whose own (already scoped) extraction queries should be `UNION`'d into the id set the table is constrained to. It is a user-owned key (never emitted by `schema:generate`, preserved across regeneration like `scope_exempt`/`scope_column`):
|
|
590
|
+
|
|
591
|
+
```json
|
|
592
|
+
{
|
|
593
|
+
"name": "users",
|
|
594
|
+
"primary_key": "id",
|
|
595
|
+
"reverse_scope": {
|
|
596
|
+
"via": [
|
|
597
|
+
{ "table": "customers", "column": "user_id" },
|
|
598
|
+
{ "table": "staff", "column": "user_id" },
|
|
599
|
+
{ "table": "business_entity_customers", "column": "kantan_yoyaku_user_id" }
|
|
600
|
+
]
|
|
601
|
+
},
|
|
602
|
+
"columns": [{ "name": "id" }, { "name": "name" }]
|
|
603
|
+
}
|
|
604
|
+
```
|
|
605
|
+
|
|
606
|
+
produces (each arm reuses that referencer's own scope, so a per-tenant run keeps only that tenant's ids):
|
|
607
|
+
|
|
608
|
+
```sql
|
|
609
|
+
SELECT users.* FROM users
|
|
610
|
+
WHERE users.id IN (
|
|
611
|
+
SELECT customers.user_id FROM customers WHERE <customers' scope> AND customers.user_id IS NOT NULL
|
|
612
|
+
UNION
|
|
613
|
+
SELECT staff.user_id FROM staff WHERE <staff' scope> AND staff.user_id IS NOT NULL
|
|
614
|
+
UNION
|
|
615
|
+
SELECT business_entity_customers.kantan_yoyaku_user_id FROM business_entity_customers
|
|
616
|
+
WHERE <…' scope> AND business_entity_customers.kantan_yoyaku_user_id IS NOT NULL
|
|
617
|
+
)
|
|
618
|
+
```
|
|
619
|
+
|
|
620
|
+
Notes:
|
|
621
|
+
|
|
622
|
+
- **`column` is explicit**, so a *non-default* foreign key (e.g. `kantan_yoyaku_user_id`, or `organization_admins.id` which itself references `users.id`) is honored, and even a column with no declared `belongs_to` edge can be enumerated.
|
|
623
|
+
- **Only scoped referencers belong in `via`.** Each arm's query must come out constrained; an unconstrained referencer (e.g. a `scope_exempt` table, or one with no path to a scope) would project *every* id and union the whole table back — so such an arm is **skipped with a warning** rather than silently widening the dump. An unknown table is likewise skipped with a warning. If no arm survives, the table stays unscopable and (in [scope-column mode](#scope-column-mode)) the run aborts via `validate_scope!`.
|
|
624
|
+
- **NULLs are excluded** per arm (`IS NOT NULL`).
|
|
625
|
+
- **Satellites need no config.** A table that `belongs_to` the reverse-scoped table (e.g. `end_users.id → users.id`, or `identities.user_id → users.id`) tightens to the kept ids automatically through the normal cascade — only the reverse-scoped table itself declares `reverse_scope`. The cascade is **multi-hop**, so a table several `belongs_to` hops below the reverse-scoped table (e.g. `end_user_profiles → end_users → users`) also tightens automatically, with no config of its own.
|
|
626
|
+
- Works in both single-target and scope-column mode. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
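Since an unknown referencer is only skipped with a warning at run time, it can help to check a config before running. The following is a minimal sketch of such a check, assuming the table configs are available as an array of JSON objects shaped like the example above; the helper name `unknown_reverse_scope_arms` is hypothetical and not part of exwiw.

```ruby
require "json"

# Hypothetical helper (not part of exwiw): lists every reverse_scope.via entry
# whose referencer table is not defined among the given table configs.
def unknown_reverse_scope_arms(table_configs)
  known = table_configs.map { |t| t["name"] }

  table_configs.flat_map do |t|
    via = t.dig("reverse_scope", "via") || []
    via.reject { |arm| known.include?(arm["table"]) }
       .map { |arm| "#{t['name']}.reverse_scope -> #{arm['table']}.#{arm['column']}" }
  end
end

# Assumed layout: one array holding every table config.
configs = JSON.parse(<<~JSON)
  [
    { "name": "users", "primary_key": "id",
      "reverse_scope": { "via": [
        { "table": "customers", "column": "user_id" },
        { "table": "staff", "column": "user_id" }
      ] } },
    { "name": "customers", "primary_key": "id" }
  ]
JSON

puts unknown_reverse_scope_arms(configs)
```

This only catches unknown table names; whether an arm actually comes out scoped still depends on the referencer's own extraction query.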
|
|
627
|
+
|
|
582
628
|
### Rails-managed tables (special `type` values)
|
|
583
629
|
|
|
584
630
|
Some tables are owned by Rails itself rather than the application — they have no ActiveRecord model and Rails reserves the right to evolve their column shape between versions (e.g. `schema_migrations`, `ar_internal_metadata`). exwiw treats them as a distinct category via the `type` field on a table config:
|
data/lib/exwiw/adapter/mysql_adapter.rb
CHANGED
|
@@ -280,6 +280,8 @@ module Exwiw
|
|
|
280
280
|
end
|
|
281
281
|
elsif where_clause.operator == :in_subquery
|
|
282
282
|
"#{key} IN (#{compile_subquery(where_clause.value)})"
|
|
283
|
+
elsif where_clause.operator == :not_null
|
|
284
|
+
"#{key} IS NOT NULL"
|
|
283
285
|
else
|
|
284
286
|
raise "Unsupported operator: #{where_clause.operator}"
|
|
285
287
|
end
|
|
@@ -290,6 +292,12 @@ module Exwiw
|
|
|
290
292
|
# extraction query, projected to a foreign key); compile it as-is.
|
|
291
293
|
return compile_ast(subquery.query) if subquery.is_a?(Exwiw::QueryAst::SelectSubquery)
|
|
292
294
|
|
|
295
|
+
# A UnionSubquery wraps several such Selects; UNION their compiled forms
|
|
296
|
+
# into a single id set.
|
|
297
|
+
if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
|
|
298
|
+
return subquery.queries.map { |q| compile_ast(q) }.join(' UNION ')
|
|
299
|
+
end
|
|
300
|
+
|
|
293
301
|
inner_values = subquery.where_values.map { |v| escape_value(v) }
|
|
294
302
|
"SELECT #{subquery.table_name}.#{subquery.select_column} " \
|
|
295
303
|
"FROM #{subquery.table_name} " \
|
data/lib/exwiw/adapter/postgresql_adapter.rb
CHANGED
|
@@ -364,6 +364,8 @@ module Exwiw
|
|
|
364
364
|
cast_to = subquery_cast_to(where_clause.value, table_name, where_clause.column_name)
|
|
365
365
|
outer_key = cast_to ? "#{key}::#{cast_to}" : key
|
|
366
366
|
"#{outer_key} IN (#{subquery_sql})"
|
|
367
|
+
elsif where_clause.operator == :not_null
|
|
368
|
+
"#{key} IS NOT NULL"
|
|
367
369
|
else
|
|
368
370
|
raise "Unsupported operator: #{where_clause.operator}"
|
|
369
371
|
end
|
|
@@ -376,6 +378,15 @@ module Exwiw
|
|
|
376
378
|
return compile_ast(subquery.query, select_cast_to: cast_to)
|
|
377
379
|
end
|
|
378
380
|
|
|
381
|
+
# A UnionSubquery wraps several projected Selects; UNION their compiled
|
|
382
|
+
# forms. cast_to is the union-wide decision (see union_cast_to): when any
|
|
383
|
+
# arm's column type would clash with the outer column or another arm,
|
|
384
|
+
# every arm's projected column and the outer key are cast to text so the
|
|
385
|
+
# UNION and the enclosing IN comparison resolve to one type.
|
|
386
|
+
if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
|
|
387
|
+
return subquery.queries.map { |q| compile_ast(q, select_cast_to: cast_to) }.join(' UNION ')
|
|
388
|
+
end
|
|
389
|
+
|
|
379
390
|
inner_values = subquery.where_values.map { |v| escape_value(v) }
|
|
380
391
|
select_expr = "#{subquery.table_name}.#{subquery.select_column}"
|
|
381
392
|
select_expr = "#{select_expr}::#{cast_to}" if cast_to
|
|
@@ -400,6 +411,11 @@ module Exwiw
|
|
|
400
411
|
private def subquery_cast_to(subquery, outer_table, outer_column)
|
|
401
412
|
return nil if outer_table.nil? || outer_column.nil?
|
|
402
413
|
|
|
414
|
+
# A UNION's arms (and the enclosing IN comparison) must all resolve to
|
|
415
|
+
# one type, so the cast decision must weigh every arm — not just one, as
|
|
416
|
+
# a flat Subquery would.
|
|
417
|
+
return union_cast_to(subquery, outer_table, outer_column) if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
|
|
418
|
+
|
|
403
419
|
inner_table, inner_column = subquery_select_target(subquery)
|
|
404
420
|
return nil if inner_table.nil?
|
|
405
421
|
|
|
@@ -408,6 +424,23 @@ module Exwiw
|
|
|
408
424
|
types_need_cast?(outer_type, inner_type) ? 'text' : nil
|
|
409
425
|
end
|
|
410
426
|
|
|
427
|
+
# Postgres rejects a UNION (or an `IN`) that mixes incompatible types
|
|
428
|
+
# (e.g. uuid and varchar). Examining only the first arm is not enough: a
|
|
429
|
+
# heterogeneous later arm would go uncast and break at execution. So
|
|
430
|
+
# consider the outer column together with every arm's projected column and,
|
|
431
|
+
# if ANY pair needs reconciliation, cast them all to text.
|
|
432
|
+
private def union_cast_to(union, outer_table, outer_column)
|
|
433
|
+
types = [column_pg_type(outer_table, outer_column)]
|
|
434
|
+
union.queries.each do |q|
|
|
435
|
+
col = q.columns.first
|
|
436
|
+
types << column_pg_type(q.from_table_name, col.name) if col
|
|
437
|
+
end
|
|
438
|
+
types.compact!
|
|
439
|
+
|
|
440
|
+
needs_cast = types.combination(2).any? { |a, b| types_need_cast?(a, b) }
|
|
441
|
+
needs_cast ? 'text' : nil
|
|
442
|
+
end
|
|
443
|
+
|
|
411
444
|
private def escape_value(value)
|
|
412
445
|
case value
|
|
413
446
|
when nil
|
data/lib/exwiw/adapter/sqlite_adapter.rb
CHANGED
|
@@ -249,6 +249,8 @@ module Exwiw
|
|
|
249
249
|
end
|
|
250
250
|
elsif where_clause.operator == :in_subquery
|
|
251
251
|
"#{key} IN (#{compile_subquery(where_clause.value)})"
|
|
252
|
+
elsif where_clause.operator == :not_null
|
|
253
|
+
"#{key} IS NOT NULL"
|
|
252
254
|
else
|
|
253
255
|
raise "Unsupported operator: #{where_clause.operator}"
|
|
254
256
|
end
|
|
@@ -259,6 +261,12 @@ module Exwiw
|
|
|
259
261
|
# extraction query, projected to a foreign key); compile it as-is.
|
|
260
262
|
return compile_ast(subquery.query) if subquery.is_a?(Exwiw::QueryAst::SelectSubquery)
|
|
261
263
|
|
|
264
|
+
# A UnionSubquery wraps several such Selects; UNION their compiled forms
|
|
265
|
+
# into a single id set.
|
|
266
|
+
if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
|
|
267
|
+
return subquery.queries.map { |q| compile_ast(q) }.join(' UNION ')
|
|
268
|
+
end
|
|
269
|
+
|
|
262
270
|
inner_values = subquery.where_values.map { |v| escape_value(v) }
|
|
263
271
|
"SELECT #{subquery.table_name}.#{subquery.select_column} " \
|
|
264
272
|
"FROM #{subquery.table_name} " \
|
data/lib/exwiw/query_ast.rb
CHANGED
|
@@ -41,7 +41,7 @@ module Exwiw
|
|
|
41
41
|
{
|
|
42
42
|
column_name: column_name,
|
|
43
43
|
operator: operator,
|
|
44
|
-
value: value.is_a?(Subquery) || value.is_a?(SelectSubquery) ? value.to_h : value,
|
|
44
|
+
value: value.is_a?(Subquery) || value.is_a?(SelectSubquery) || value.is_a?(UnionSubquery) ? value.to_h : value,
|
|
45
45
|
}
|
|
46
46
|
end
|
|
47
47
|
end
|
|
@@ -83,6 +83,25 @@ module Exwiw
|
|
|
83
83
|
end
|
|
84
84
|
end
|
|
85
85
|
|
|
86
|
+
# A subquery that UNIONs several single-column `Select`s into one id set.
|
|
87
|
+
# Used by *multi-referencer* reverse / "referenced_by" extraction
|
|
88
|
+
# (TableConfig#reverse_scope): a parent table referenced by many scoped
|
|
89
|
+
# tables is constrained to the union of every referencer's projected foreign
|
|
90
|
+
# key, rather than falling back to dumping every row:
|
|
91
|
+
#
|
|
92
|
+
# <parent>.<pk> IN (
|
|
93
|
+
# SELECT <ref1>.<col1> FROM <ref1> WHERE <ref1 scope> AND <col1> IS NOT NULL
|
|
94
|
+
# UNION SELECT <ref2>.<col2> FROM <ref2> WHERE <ref2 scope> AND <col2> IS NOT NULL
|
|
95
|
+
# )
|
|
96
|
+
#
|
|
97
|
+
# Each `queries` element is a `Select` already projected to the foreign-key
|
|
98
|
+
# column that points at the parent (with a NULL-excluding filter).
|
|
99
|
+
UnionSubquery = Struct.new(:queries, keyword_init: true) do
|
|
100
|
+
def to_h
|
|
101
|
+
{ union: queries.map(&:to_h) }
|
|
102
|
+
end
|
|
103
|
+
end
|
|
104
|
+
|
|
86
105
|
module ColumnValue
|
|
87
106
|
Base = Struct.new(:name, :value, keyword_init: true)
|
|
88
107
|
Plain = Class.new(Base)
|
data/lib/exwiw/query_ast_builder.rb
CHANGED
|
@@ -2,8 +2,8 @@
|
|
|
2
2
|
|
|
3
3
|
module Exwiw
|
|
4
4
|
class QueryAstBuilder
|
|
5
|
-
def self.run(table_name, table_by_name, dump_target, logger, allow_reverse: true,
|
|
6
|
-
new(table_name, table_by_name, dump_target, logger, allow_reverse: allow_reverse,
|
|
5
|
+
def self.run(table_name, table_by_name, dump_target, logger, allow_reverse: true, forward_path: [])
|
|
6
|
+
new(table_name, table_by_name, dump_target, logger, allow_reverse: allow_reverse, forward_path: forward_path).run
|
|
7
7
|
end
|
|
8
8
|
|
|
9
9
|
# Scope-column mode classification for a single table. One of
|
|
@@ -49,17 +49,20 @@ module Exwiw
|
|
|
49
49
|
|
|
50
50
|
attr_reader :table_name, :table_by_name, :dump_target
|
|
51
51
|
|
|
52
|
-
def initialize(table_name, table_by_name, dump_target, logger, allow_reverse: true,
|
|
52
|
+
def initialize(table_name, table_by_name, dump_target, logger, allow_reverse: true, forward_path: [])
|
|
53
53
|
@table_name = table_name
|
|
54
54
|
@table_by_name = table_by_name
|
|
55
55
|
@dump_target = dump_target
|
|
56
56
|
@logger = logger
|
|
57
57
|
@allow_reverse = allow_reverse
|
|
58
|
-
# @
|
|
59
|
-
#
|
|
60
|
-
#
|
|
61
|
-
#
|
|
62
|
-
|
|
58
|
+
# @forward_path is the chain of tables currently being forward-resolved by
|
|
59
|
+
# the "scope via an indirectly-scoped belongs_to parent" rescue
|
|
60
|
+
# (build_belongs_to_scoped_clause). Each forward hop appends the table it is
|
|
61
|
+
# descending from, so the rescue recurses N levels (users -> end_users ->
|
|
62
|
+
# end_user_profiles -> ...) and stops only on a real belongs_to cycle: a
|
|
63
|
+
# table already on the path is not re-resolved, falling through to
|
|
64
|
+
# :unscopable instead of looping forever.
|
|
65
|
+
@forward_path = forward_path
|
|
63
66
|
end
|
|
64
67
|
|
|
65
68
|
def run
|
|
@@ -168,6 +171,15 @@ module Exwiw
|
|
|
168
171
|
# such (single, unambiguous) referencer, leaving the caller to fall back to
|
|
169
172
|
# the dump-all behavior.
|
|
170
173
|
private def build_referenced_by_clause(table)
|
|
174
|
+
# Opt-in multi-referencer reverse scope (TableConfig#reverse_scope): when
|
|
175
|
+
# the schema author has enumerated the referencers explicitly, constrain
|
|
176
|
+
# the table to the UNION of those referencers' scoped queries instead of
|
|
177
|
+
# the single-referencer auto-detection below (which bails to a full dump
|
|
178
|
+
# once two or more tables reference the table).
|
|
179
|
+
if table.reverse_scope && table.reverse_scope.via.any?
|
|
180
|
+
return build_reverse_scope_via_clause(table)
|
|
181
|
+
end
|
|
182
|
+
|
|
171
183
|
candidates = table_by_name.each_value.filter_map do |other|
|
|
172
184
|
next if other.name == table.name
|
|
173
185
|
|
|
@@ -178,10 +190,11 @@ module Exwiw
|
|
|
178
190
|
next if relation.nil? || relation.polymorphic?
|
|
179
191
|
|
|
180
192
|
# Build the child's own extraction query. allow_reverse:false stops a
|
|
181
|
-
# chain of FK-less tables from recursing back into each other;
|
|
182
|
-
#
|
|
183
|
-
#
|
|
184
|
-
|
|
193
|
+
# chain of FK-less tables from recursing back into each other; adding this
|
|
194
|
+
# table to forward_path stops the child from forward-scoping back through
|
|
195
|
+
# it (which would loop) while still letting the child forward-scope
|
|
196
|
+
# through other tables.
|
|
197
|
+
child_query = self.class.run(other.name, table_by_name, dump_target, @logger, allow_reverse: false, forward_path: @forward_path + [table.name])
|
|
185
198
|
|
|
186
199
|
# Only an *already constrained* child narrows anything; an unconstrained
|
|
187
200
|
# child would select every fk value (i.e. dump all) and not help.
|
|
@@ -219,6 +232,64 @@ module Exwiw
|
|
|
219
232
|
)
|
|
220
233
|
end
|
|
221
234
|
|
|
235
|
+
# Multi-referencer reverse scope (TableConfig#reverse_scope). Builds a
|
|
236
|
+
# `pk IN (SELECT ref1.col1 FROM ref1 <scope> UNION SELECT ref2.col2 ...)`
|
|
237
|
+
# clause for a global-identity table referenced by many scoped tables. Each
|
|
238
|
+
# `via` arm reuses the referencer's own (already-scoped) extraction query —
|
|
239
|
+
# so a per-tenant run keeps only that tenant's ids — projected down to the
|
|
240
|
+
# foreign-key column that points at this table, with NULLs excluded.
|
|
241
|
+
#
|
|
242
|
+
# An arm whose referencer is unknown or comes out unconstrained is skipped
|
|
243
|
+
# with a warning rather than included: an unconstrained arm would project
|
|
244
|
+
# every row's id and union the whole table back, silently defeating the
|
|
245
|
+
# prune. Returns nil when no arm survives, leaving the caller to fall back to
|
|
246
|
+
# the dump-all behavior (which validate_scope! then rejects in scope mode).
|
|
247
|
+
private def build_reverse_scope_via_clause(table)
|
|
248
|
+
arms = table.reverse_scope.via.filter_map do |via|
|
|
249
|
+
referencer = table_by_name[via.table]
|
|
250
|
+
if referencer.nil?
|
|
251
|
+
@logger.warn(" #{table.name}.reverse_scope references unknown table '#{via.table}'; skipping arm.")
|
|
252
|
+
next
|
|
253
|
+
end
|
|
254
|
+
|
|
255
|
+
# Build the referencer's own scoped extraction query. allow_reverse is
|
|
256
|
+
# disabled and this table is added to forward_path to bound recursion
|
|
257
|
+
# exactly as the single-referencer path does (a referencer that could only
|
|
258
|
+
# be scoped by recursing back into this table would loop); the referencer
|
|
259
|
+
# may still forward-scope through other tables.
|
|
260
|
+
ref_query = self.class.run(referencer.name, table_by_name, dump_target, @logger, allow_reverse: false, forward_path: @forward_path + [table.name])
|
|
261
|
+
|
|
262
|
+
unless ref_query.where_clauses.any? || ref_query.join_clauses.any?
|
|
263
|
+
@logger.warn(
|
|
264
|
+
" #{table.name}.reverse_scope arm '#{via.table}.#{via.column}' is not scoped; " \
|
|
265
|
+
"skipping it (an unconstrained arm would union every row back). " \
|
|
266
|
+
"Make '#{via.table}' scopable or remove it from reverse_scope.via."
|
|
267
|
+
)
|
|
268
|
+
next
|
|
269
|
+
end
|
|
270
|
+
|
|
271
|
+
# Project the referencer's query to the foreign-key column that points
|
|
272
|
+
# at this table, excluding NULLs. Force a plain column so any masking /
|
|
273
|
+
# raw_sql configured on that column does not corrupt the id comparison.
|
|
274
|
+
fk_column = TableColumn.from_symbol_keys(name: via.column)
|
|
275
|
+
projected = QueryAst::Select.new
|
|
276
|
+
projected.from(ref_query.from_table_name)
|
|
277
|
+
projected.select([fk_column])
|
|
278
|
+
ref_query.join_clauses.each { |j| projected.join(j) }
|
|
279
|
+
ref_query.where_clauses.each { |w| projected.where(w) }
|
|
280
|
+
projected.where(QueryAst::WhereClause.new(column_name: via.column, operator: :not_null))
|
|
281
|
+
projected
|
|
282
|
+
end
|
|
283
|
+
|
|
284
|
+
return nil if arms.empty?
|
|
285
|
+
|
|
286
|
+
QueryAst::WhereClause.new(
|
|
287
|
+
column_name: table.primary_key,
|
|
288
|
+
operator: :in_subquery,
|
|
289
|
+
value: QueryAst::UnionSubquery.new(queries: arms)
|
|
290
|
+
)
|
|
291
|
+
end
|
|
292
|
+
|
|
222
293
|
# Scope-column mode. Builds a `fk IN (SELECT parent.pk FROM <parent
|
|
223
294
|
# extraction query>)` clause for a table whose belongs_to parent is itself
|
|
224
295
|
# scopable but carries no scope column of its own — so find_path_to_scoped
|
|
@@ -230,6 +301,13 @@ module Exwiw
|
|
|
230
301
|
# them out of a full dump. Returns nil when there is no single, unambiguous
|
|
231
302
|
# scopable parent, leaving the caller on the unscopable path.
|
|
232
303
|
private def build_belongs_to_scoped_clause(table)
|
|
304
|
+
# This table plus every ancestor currently being forward-resolved. A
|
|
305
|
+
# candidate parent already on this path would close a belongs_to cycle, so
|
|
306
|
+
# it is skipped; threading the grown path into the parent build lets the
|
|
307
|
+
# cascade recurse N hops (users -> end_users -> end_user_profiles -> ...)
|
|
308
|
+
# and terminate only when a table reappears.
|
|
309
|
+
forward_path = @forward_path + [table.name]
|
|
310
|
+
|
|
233
311
|
candidates = table.belongs_tos.filter_map do |relation|
|
|
234
312
|
# A polymorphic belongs_to points at several parent tables through one
|
|
235
313
|
# column, so it cannot project to a single parent id set; skip it.
|
|
@@ -238,10 +316,15 @@ module Exwiw
|
|
|
238
316
|
parent = table_by_name[relation.table_name]
|
|
239
317
|
next if parent.nil?
|
|
240
318
|
|
|
319
|
+
# Cycle guard: descending into a parent already on the forward path would
|
|
320
|
+
# loop (a -> b -> a). Stop, leaving this table on the :unscopable path.
|
|
321
|
+
next if forward_path.include?(parent.name)
|
|
322
|
+
|
|
241
323
|
# Build the parent's own scoped query. allow_reverse stays true so the
|
|
242
|
-
# parent may be scoped via referenced_by
|
|
243
|
-
#
|
|
244
|
-
|
|
324
|
+
# parent may be scoped via referenced_by, and forward scoping stays
|
|
325
|
+
# enabled so a parent that is itself scoped via *its* parent resolves
|
|
326
|
+
# too — this is what makes the cascade multi-hop.
|
|
327
|
+
parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
|
|
245
328
|
|
|
246
329
|
# Only a constrained parent narrows anything; an unconstrained parent
|
|
247
330
|
# would select every pk (i.e. dump all) and not help.
|
|
@@ -393,11 +476,18 @@ module Exwiw
|
|
|
393
476
|
return :direct if directly_scoped?(table)
|
|
394
477
|
return :via_path if build_join_clauses_scoped(table).any?
|
|
395
478
|
return :referenced_by if @allow_reverse && build_referenced_by_clause(table)
|
|
396
|
-
return :via_scoped_parent if
|
|
479
|
+
return :via_scoped_parent if forward_scope_allowed?(table) && build_belongs_to_scoped_clause(table)
|
|
397
480
|
|
|
398
481
|
:unscopable
|
|
399
482
|
end
|
|
400
483
|
|
|
484
|
+
# True when this table may still attempt the forward "scope via a scoped
|
|
485
|
+
# belongs_to parent" rescue: it is not already on the forward-resolution
|
|
486
|
+
# path, so descending into its parent cannot revisit it (a belongs_to cycle).
|
|
487
|
+
private def forward_scope_allowed?(table)
|
|
488
|
+
!@forward_path.include?(table.name)
|
|
489
|
+
end
|
|
490
|
+
|
|
401
491
|
private def build_scoped(table)
|
|
402
492
|
ast = QueryAst::Select.new
|
|
403
493
|
ast.from(table.name)
|
|
@@ -435,11 +525,13 @@ module Exwiw
|
|
|
435
525
|
end
|
|
436
526
|
end
|
|
437
527
|
|
|
438
|
-
if
|
|
528
|
+
if forward_scope_allowed?(table)
|
|
439
529
|
# Belongs_to a parent that is itself scoped but carries no scope column of
|
|
440
530
|
# its own (so via_path cannot terminate on it) — e.g. a hub table scoped
|
|
441
|
-
# only via referenced_by
|
|
442
|
-
#
|
|
531
|
+
# only via referenced_by, or a parent that is itself scoped through *its*
|
|
532
|
+
# parent. Constrain this table to that parent's in-scope ids so its rows
|
|
533
|
+
# ride along instead of being dumped in full; the parent build recurses
|
|
534
|
+
# the cascade further up.
|
|
443
535
|
parent_clause = build_belongs_to_scoped_clause(table)
|
|
444
536
|
if parent_clause
|
|
445
537
|
ast.where(parent_clause)
|
|
@@ -447,12 +539,13 @@ module Exwiw
|
|
|
447
539
|
end
|
|
448
540
|
end
|
|
449
541
|
|
|
450
|
-
# Only the genuine top-level build (
|
|
451
|
-
#
|
|
452
|
-
#
|
|
453
|
-
#
|
|
454
|
-
# (potential full
|
|
455
|
-
|
|
542
|
+
# Only the genuine top-level build (allow_reverse on, forward_path empty —
|
|
543
|
+
# i.e. no rescue subquery in progress) is allowed to fail hard. The
|
|
544
|
+
# Runner/ExplainRunner pre-flight (validate_scope!) rejects unscopable
|
|
545
|
+
# tables before extraction, so a top-level build never legitimately lands
|
|
546
|
+
# here; if it does, raise rather than emit an unfiltered (potential full
|
|
547
|
+
# PII) dump.
|
|
548
|
+
if @allow_reverse && @forward_path.empty?
|
|
456
549
|
raise ArgumentError, scope_unscopable_message(table)
|
|
457
550
|
end
|
|
458
551
|
|
data/lib/exwiw/reverse_scope.rb
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Exwiw
|
|
4
|
+
# One referencer arm of a {ReverseScope}: the referencing table and the column
|
|
5
|
+
# on it that points at the reverse-scoped table's primary key.
|
|
6
|
+
#
|
|
7
|
+
# `column` is given explicitly so a *non-default* foreign key (e.g.
|
|
8
|
+
# `business_entity_customers.kantan_yoyaku_user_id`, or `organization_admins.id`
|
|
9
|
+
# which itself references `users.id`) can be projected — and even a column with
|
|
10
|
+
# no declared `belongs_to` edge can be enumerated.
|
|
11
|
+
class ReverseScopeVia
|
|
12
|
+
include Serdes
|
|
13
|
+
|
|
14
|
+
attribute :table, String
|
|
15
|
+
attribute :column, String
|
|
16
|
+
end
|
|
17
|
+
|
|
18
|
+
# Opt-in configuration for *multi-referencer* reverse scoping
|
|
19
|
+
# (see {QueryAstBuilder#build_referenced_by_clause}).
|
|
20
|
+
#
|
|
21
|
+
# A global-identity table such as `users` carries no scope/tenant column and
|
|
22
|
+
# has no `belongs_to` path of its own to the dump target; many tenant-owned
|
|
23
|
+
# tables instead point *at* it. The automatic single-referencer reverse
|
|
24
|
+
# extraction only narrows a table referenced by exactly one constrained child
|
|
25
|
+
# — with two or more referencers it falls back to a full dump. `reverse_scope`
|
|
26
|
+
# lets the schema author enumerate the referencers whose own (already-scoped)
|
|
27
|
+
# extraction queries should be UNION'd into the id set this table is
|
|
28
|
+
# constrained to:
|
|
29
|
+
#
|
|
30
|
+
# <table>.<pk> IN (
|
|
31
|
+
# SELECT <ref1>.<col1> FROM <ref1> <ref1 scope> WHERE <col1> IS NOT NULL
|
|
32
|
+
# UNION SELECT <ref2>.<col2> FROM <ref2> <ref2 scope> WHERE <col2> IS NOT NULL
|
|
33
|
+
# UNION ...
|
|
34
|
+
# )
|
|
35
|
+
#
|
|
36
|
+
# It is deliberately explicit — never inferred or emitted by the schema
|
|
37
|
+
# generators, and preserved across regeneration like the other user-owned
|
|
38
|
+
# config (see {TableConfig#merge}). Only referencers that are themselves scoped
|
|
39
|
+
# belong in `via`: an unconstrained referencer would project every row's id and
|
|
40
|
+
# union the whole table back, defeating the prune (such an arm is skipped with
|
|
41
|
+
# a warning rather than silently widening the dump).
|
|
42
|
+
class ReverseScope
|
|
43
|
+
include Serdes
|
|
44
|
+
|
|
45
|
+
attribute :via, array(ReverseScopeVia), default: []
|
|
46
|
+
end
|
|
47
|
+
end
|
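The UNION shape described in the `ReverseScope` comment can be sketched in plain Ruby. This is an illustration under stated assumptions, not exwiw's implementation: `Via`, `scope_sql`, and `reverse_scope_clause` are hypothetical names, the table and column names are made up, and the real builder presumably assembles a query AST rather than SQL strings. Returning `nil` when every arm is skipped is also an assumption.

```ruby
# Hypothetical stand-in for one referencer arm; not an exwiw class.
# scope_sql is the arm's own (already-scoped) WHERE condition, or nil if unscoped.
Via = Struct.new(:table, :column, :scope_sql)

# Build "<table>.<pk> IN (arm UNION arm ...)" from the scoped referencer arms.
# An unscoped arm is skipped with a warning instead of widening the id set.
def reverse_scope_clause(table, pk, vias)
  arms = vias.filter_map do |via|
    if via.scope_sql.nil?
      warn "skipping unscoped referencer #{via.table}.#{via.column}"
      next # nil result is dropped by filter_map
    end
    "SELECT #{via.table}.#{via.column} FROM #{via.table} " \
      "WHERE #{via.scope_sql} AND #{via.table}.#{via.column} IS NOT NULL"
  end
  return nil if arms.empty? # assumption: nothing left to constrain by

  "#{table}.#{pk} IN (#{arms.join(' UNION ')})"
end

example_vias = [
  Via.new("orders", "user_id", "orders.shop_id = 1"),
  Via.new("reviews", "author_id", "reviews.shop_id = 1"),
  Via.new("audit_logs", "user_id", nil) # unscoped: skipped with a warning
]
puts reverse_scope_clause("users", "id", example_vias)
```

The `IS NOT NULL` filter on each arm keeps nullable foreign keys out of the id set, and `UNION` (rather than `UNION ALL`) deduplicates ids that several referencers share.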
data/lib/exwiw/table_config.rb
CHANGED
@@ -39,6 +39,14 @@ module Exwiw
     attribute :scope_exempt, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
     attribute :scope_column, optional(String), skip_serializing_if_nil: true
 
+    # `reverse_scope` opts a table into multi-referencer reverse scoping (see
+    # Exwiw::ReverseScope and QueryAstBuilder#build_referenced_by_clause): a
+    # global-identity table (e.g. `users`) referenced by many scoped tables is
+    # constrained to the UNION of those referencers' projected foreign keys
+    # instead of being dumped in full. User-configured and never emitted by the
+    # schema generators.
+    attribute :reverse_scope, Serdes::OptionalType.new(ReverseScope), skip_serializing_if_nil: true
+
     def self.from(hash)
       config = super
       config.send(:validate_after_load!)
@@ -58,6 +66,7 @@ module Exwiw
       if rails_managed?
         hash.delete("belongs_tos")
         hash.delete("columns")
+        hash.delete("reverse_scope")
       end
       hash
     end
@@ -152,6 +161,7 @@ module Exwiw
       # User-owned, never regenerated: carry over from the existing config.
       merged_table.scope_exempt = scope_exempt
       merged_table.scope_column = scope_column
+      merged_table.reverse_scope = reverse_scope
 
       # Structural facts of each belongs_to come from the freshly generated
       # config, but the user-owned `comment`/`ignore`/`ignore_type`/`references`
@@ -199,6 +209,10 @@ module Exwiw
           raise ArgumentError,
             "Table '#{name}' has type=#{type}; columns must not be defined."
         end
+        if reverse_scope
+          raise ArgumentError,
+            "Table '#{name}' has type=#{type}; reverse_scope must not be defined."
+        end
       else
         # An ignore:true table is not extracted, so primary_key is not required
         # (e.g. a composite-primary-key table that exwiw does not support).
data/lib/exwiw/version.rb
CHANGED
data/lib/exwiw.rb
CHANGED
@@ -9,6 +9,7 @@ require_relative "exwiw/ext_json"
 require_relative "exwiw/config_file"
 require_relative "exwiw/belongs_to"
 require_relative "exwiw/table_column"
+require_relative "exwiw/reverse_scope"
 require_relative "exwiw/table_config"
 require_relative "exwiw/embedded_in"
 require_relative "exwiw/mongodb_field"
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: exwiw
 version: !ruby/object:Gem::Version
-  version: 0.8.
+  version: 0.8.3
 platform: ruby
 authors:
 - Shia
@@ -76,6 +76,7 @@ files:
 - lib/exwiw/query_ast.rb
 - lib/exwiw/query_ast_builder.rb
 - lib/exwiw/railtie.rb
+- lib/exwiw/reverse_scope.rb
 - lib/exwiw/runner.rb
 - lib/exwiw/schema_generator.rb
 - lib/exwiw/table_column.rb