exwiw 0.8.1 → 0.8.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 80610cc2d13a87793171563b2b8e4ff0568135eb987b550dd9f89bffe89b1d67
4
- data.tar.gz: 5e9f4976043571647163e9f743c3c8fb709a29944b590c55424dce374cca3269
3
+ metadata.gz: 90d949cb54565ed644b599102cfabc16236fe2d83523594cd7b14f269118a9b2
4
+ data.tar.gz: 7236bb6ee6b9dda38aec93491f0266e334890e3c271b4f7d5d072d46a1e0f88b
5
5
  SHA512:
6
- metadata.gz: 8508aaa2d9cba3310a9ee4d4b940682b894bbafcab9672e4ccca5fb8f1a4b43e6fe9c36cbfcd0efe8f12c5308fa9be33aeb16d53b6d98ef71692ec2e30076be0
7
- data.tar.gz: 8a1a8187e7547f4ce0eeaee458d36b16e56c08c1f2275977127de56a49eb2a3b25e22d8c52d3678fa6ea91e553ebf6c69bf943e5bb108e3984fa8fc34f03a2f0
6
+ metadata.gz: bcfac0aaf220b55dfa3172f94d2ad0a6e828f53253e367aa9203db7061695fa6a9188e22a92c18d51bd9a55b6a9bf840cf780a3647d4aca2a0cccff1a785a347
7
+ data.tar.gz: d891fc7101fea30d674b933213715a8a4534c459e191b9bf4fce31b3bda8a9be037bf09627419f43ef914d2a9288aa40e1cd4836a4d8e1e0901d73e320218cd3
data/CHANGELOG.md CHANGED
@@ -2,6 +2,18 @@
2
2
 
3
3
  ## [Unreleased]
4
4
 
5
+ ## [0.8.3] - 2026-06-24
6
+
7
+ ### Fixed
8
+
9
+ - **Forward scope (`via_scoped_parent`) now cascades across multiple `belongs_to` hops instead of dying after one.** A table with no scope column of its own is scoped by constraining it to its `belongs_to` parent's in-scope ids (`fk IN (SELECT parent.pk FROM <parent's scoped query>)`). Previously the parent was rebuilt with forward scoping turned off, so if the parent was *itself* scoped only through *its* parent (e.g. an identity-family table two or more hops below a `reverse_scope`/`referenced_by` table — `users ← end_users ← end_user_profiles`), the rebuilt parent came back unconstrained and the child was classified `:unscopable` — forcing a `scope_exempt` full dump and re-introducing the bloat the prune removes. The boolean single-hop bound is replaced by a forward-path guard: the rescue keeps forward scoping enabled while rebuilding the parent (appending the current table to the path), so the cascade recurses N levels and produces a correspondingly nested `IN (subquery)`; it terminates only on a genuine `belongs_to` cycle (a table already on the path is not revisited, falling through to `:unscopable`). The single-unambiguous-parent rule and the polymorphic-skip are unchanged, and the reverse arms still cannot loop back through the table being reverse-scoped. SQL adapters only.
10
+
11
+ ## [0.8.2] - 2026-06-24
12
+
13
+ ### Added
14
+
15
+ - **Multi-referencer reverse scoping (`reverse_scope`).** Reverse / "referenced_by" extraction previously narrowed only a table referenced by *exactly one* constrained child; a table referenced by two or more (most importantly a **global-identity table** like `users`, which carries no scope/tenant column and has no `belongs_to` of its own, yet many scoped tables point *at* it) fell back to dumping every row — dragging in every tenant's identities. A table can now opt into **multi-referencer** reverse scoping with a user-owned `reverse_scope: { via: [{ table, column }, …] }` key listing the referencers whose own (already scoped) extraction queries are `UNION`'d into the id set it is constrained to (`pk IN (SELECT ref1.col1 FROM ref1 <scope> UNION SELECT ref2.col2 FROM ref2 <scope> …)`). Each arm reuses that referencer's own scope, so a per-tenant run keeps only that tenant's ids; the named `column` is explicit, so a non-default foreign key (or a column with no declared `belongs_to`) is honored; NULLs are excluded per arm. Only **scoped** referencers belong in `via` — an unconstrained arm (e.g. a `scope_exempt` referencer) would union every row back, so it is skipped with a warning rather than silently widening the dump, and an unknown table is likewise skipped; if no arm survives, the table stays unscopable (so scope-column mode's `validate_scope!` still aborts rather than dumping it in full). Tables that `belongs_to` the reverse-scoped table tighten automatically through the existing cascade and need no config. `reverse_scope` is never emitted by `schema:generate` and is preserved across regeneration like `scope_exempt`/`scope_column`. SQL adapters only. See the [README](README.md#reverse-scope-for-multi-referencer-tables-reverse_scope).
16
+
5
17
  ## [0.8.1] - 2026-06-24
6
18
 
7
19
  ### Fixed
data/README.md CHANGED
@@ -183,8 +183,11 @@ Each table is resolved as follows:
183
183
  (`fk IN (SELECT parent.pk FROM <parent's scoped query>)`). This covers a *hub*
184
184
  table that has no scope column and is scoped only because an extractable child
185
185
  references it (see referenced-by below): the hub's other `belongs_to` children
186
- ride along to just the in-scope rows instead of being dumped in full. Limited to
187
- a single forward hop and a single unambiguous scopable parent.
186
+ ride along to just the in-scope rows instead of being dumped in full. The parent
187
+ itself may be scoped the same way, so this **cascades across multiple hops**
188
+ (each a single unambiguous scopable parent) and the subquery nests
189
+ correspondingly; the recursion terminates on a genuine `belongs_to` cycle (a
190
+ table already on the path is left `:unscopable` rather than looped on).
188
191
  - **Cannot be scoped at all** (no scope column and no path to one) → exwiw
189
192
  **aborts** and lists the offending tables, so an unscoped table is never silently
190
193
  dumped in full. For each, either declare a `scope_column`, add a `belongs_to`
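Reviewer note, not part of the package: the multi-hop cascade and its cycle guard described in this hunk can be illustrated with a minimal sketch. It assumes a simplified model in which every table has at most one `belongs_to` parent; the `BELONGS_TO` and `ALREADY_SCOPED` maps, the foreign-key column names, and `scoped_id_query` are hypothetical stand-ins, not exwiw's API. The real builder works on a query AST rather than on strings.

```ruby
# Hypothetical schema: table => [belongs_to parent, foreign-key column].
BELONGS_TO = {
  "end_user_profiles" => ["end_users", "end_user_id"],
  "end_users"         => ["users", "user_id"],
  "a"                 => ["b", "b_id"], # a and b form a belongs_to cycle
  "b"                 => ["a", "a_id"]
}.freeze

# Tables already constrained by other means (stand-in for e.g. reverse_scope).
ALREADY_SCOPED = {
  "users" => "SELECT users.id FROM users WHERE <users scope>"
}.freeze

# Returns a query projecting the in-scope ids of `table`, or nil when the
# table cannot be scoped. `forward_path` is the chain of tables currently
# being resolved; a parent already on it would close a cycle.
def scoped_id_query(table, forward_path = [])
  return ALREADY_SCOPED[table] if ALREADY_SCOPED.key?(table)

  parent, fk = BELONGS_TO[table]
  return nil if parent.nil?

  path = forward_path + [table]
  return nil if path.include?(parent) # cycle guard: do not revisit

  parent_query = scoped_id_query(parent, path)
  return nil if parent_query.nil?

  "SELECT #{table}.id FROM #{table} WHERE #{table}.#{fk} IN (#{parent_query})"
end

puts scoped_id_query("end_user_profiles")
```

In this sketch each hop adds one level of `IN (...)` nesting, and the `a`/`b` pair resolves to `nil` (the analogue of `:unscopable`) instead of recursing forever.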
@@ -579,6 +582,49 @@ ActiveStorage is handled automatically — no ActiveStorage-specific configurati
579
582
  `active_storage_variant_records` also references blobs, but since it has no path of its own to the dump target it doesn't constrain anything and is ignored as a referencer — blobs stays narrowed to the attachment-referenced ids. (A parent referenced by *multiple* constrained children currently falls back to dumping all of its rows.)
580
583
  - **`active_storage_variant_records`** holds derivative variant-tracking rows that ActiveStorage regenerates lazily, and it too has no path to the dump target — left alone it would land in the "no relation → dump all" branch and, worse, its `blob_id` could point at blobs outside the narrowed set above (a foreign-key violation on import). `exwiw:schema:generate` therefore emits it with **`ignore: true`** (and drops it from the attachments `record` polymorphic expansion so nothing carries a dangling reference to it), so its data is skipped while the DDL is still written. Remove `ignore` from the generated config if you really need to export it.
581
584
 
585
+ ### Reverse scope for multi-referencer tables (`reverse_scope`)
586
+
587
+ The automatic reverse extraction above narrows a table referenced by **exactly one** constrained child. A table referenced by **two or more** constrained children falls back to dumping every row — fine for `active_storage_blobs`, but a problem for a **global-identity table** such as `users`: it carries no scope/tenant column and has no `belongs_to` of its own, yet dozens of scoped tables point *at* it. Dumping it (and everything that hangs off it) in full pulls in every tenant's identities.
588
+
589
+ `reverse_scope` opts such a table into **multi-referencer** reverse scoping: you enumerate the referencers whose own (already scoped) extraction queries should be `UNION`'d into the id set the table is constrained to. It is a user-owned key (never emitted by `schema:generate`, preserved across regeneration like `scope_exempt`/`scope_column`):
590
+
591
+ ```json
592
+ {
593
+ "name": "users",
594
+ "primary_key": "id",
595
+ "reverse_scope": {
596
+ "via": [
597
+ { "table": "customers", "column": "user_id" },
598
+ { "table": "staff", "column": "user_id" },
599
+ { "table": "business_entity_customers", "column": "kantan_yoyaku_user_id" }
600
+ ]
601
+ },
602
+ "columns": [{ "name": "id" }, { "name": "name" }]
603
+ }
604
+ ```
605
+
606
+ produces (each arm reuses that referencer's own scope, so a per-tenant run keeps only that tenant's ids):
607
+
608
+ ```sql
609
+ SELECT users.* FROM users
610
+ WHERE users.id IN (
611
+ SELECT customers.user_id FROM customers WHERE <customers' scope> AND customers.user_id IS NOT NULL
612
+ UNION
613
+ SELECT staff.user_id FROM staff WHERE <staff's scope> AND staff.user_id IS NOT NULL
614
+ UNION
615
+ SELECT business_entity_customers.kantan_yoyaku_user_id FROM business_entity_customers
616
+ WHERE <…' scope> AND business_entity_customers.kantan_yoyaku_user_id IS NOT NULL
617
+ )
618
+ ```
619
+
620
+ Notes:
621
+
622
+ - **`column` is explicit**, so a *non-default* foreign key (e.g. `kantan_yoyaku_user_id`, or `organization_admins.id` which itself references `users.id`) is honored, and even a column with no declared `belongs_to` edge can be enumerated.
623
+ - **Only scoped referencers belong in `via`.** Each arm's query must come out constrained; an unconstrained referencer (e.g. a `scope_exempt` table, or one with no path to a scope) would project *every* id and union the whole table back — so such an arm is **skipped with a warning** rather than silently widening the dump. An unknown table is likewise skipped with a warning. If no arm survives, the table stays unscopable and (in [scope-column mode](#scope-column-mode)) the run aborts via `validate_scope!`.
624
+ - **NULLs are excluded** per arm (`IS NOT NULL`).
625
+ - **Satellites need no config.** A table that `belongs_to` the reverse-scoped table (e.g. `end_users.id → users.id`, or `identities.user_id → users.id`) tightens to the kept ids automatically through the normal cascade — only the reverse-scoped table itself declares `reverse_scope`. The cascade is **multi-hop**, so a table several `belongs_to` hops below the reverse-scoped table (e.g. `end_user_profiles → end_users → users`) also tightens automatically, with no config of its own.
626
+ - Works in both single-target and scope-column mode. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
627
+
582
628
  ### Rails-managed tables (special `type` values)
583
629
 
584
630
  Some tables are owned by Rails itself rather than the application — they have no ActiveRecord model and Rails reserves the right to evolve their column shape between versions (e.g. `schema_migrations`, `ar_internal_metadata`). exwiw treats them as a distinct category via the `type` field on a table config:
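Reviewer note, not part of the package: the `reverse_scope` rules in the hunk above (one `UNION` arm per referencer, NULLs excluded, unknown or unscoped arms skipped with a warning, nothing emitted when no arm survives) can be sketched as follows. `ViaArm`, `SCOPE_OF`, the `tenant_id` conditions, and `reverse_scope_sql` are hypothetical illustrations; the gem builds an AST and resolves each referencer's scope itself.

```ruby
# Hypothetical stand-in for one `via` entry.
ViaArm = Struct.new(:table, :column, keyword_init: true)

# Hypothetical per-table scope conditions; nil means the table is known but
# comes out unconstrained (e.g. a scope_exempt table).
SCOPE_OF = {
  "customers"  => "customers.tenant_id = 1",
  "staff"      => "staff.tenant_id = 1",
  "audit_logs" => nil
}.freeze

# Builds the reverse-scoped query for `table`, or returns nil when no arm
# survives. Skipped arms are reported through `warnings`.
def reverse_scope_sql(table, primary_key, arms, warnings = [])
  selects = arms.filter_map do |arm|
    unless SCOPE_OF.key?(arm.table)
      warnings << "unknown table '#{arm.table}'; skipping arm"
      next
    end

    scope = SCOPE_OF[arm.table]
    if scope.nil?
      warnings << "arm '#{arm.table}.#{arm.column}' is not scoped; skipping arm"
      next
    end

    "SELECT #{arm.table}.#{arm.column} FROM #{arm.table} " \
      "WHERE #{scope} AND #{arm.table}.#{arm.column} IS NOT NULL"
  end
  return nil if selects.empty?

  "SELECT #{table}.* FROM #{table} " \
    "WHERE #{table}.#{primary_key} IN (#{selects.join(' UNION ')})"
end

warnings = []
arms = [
  ViaArm.new(table: "customers", column: "user_id"),
  ViaArm.new(table: "audit_logs", column: "user_id"),
  ViaArm.new(table: "staff", column: "user_id")
]
puts reverse_scope_sql("users", "id", arms, warnings)
puts warnings
```

Skipping an unscoped arm instead of including it is the design choice the notes call out: including it would union every id back in and defeat the prune.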
@@ -280,6 +280,8 @@ module Exwiw
280
280
  end
281
281
  elsif where_clause.operator == :in_subquery
282
282
  "#{key} IN (#{compile_subquery(where_clause.value)})"
283
+ elsif where_clause.operator == :not_null
284
+ "#{key} IS NOT NULL"
283
285
  else
284
286
  raise "Unsupported operator: #{where_clause.operator}"
285
287
  end
@@ -290,6 +292,12 @@ module Exwiw
290
292
  # extraction query, projected to a foreign key); compile it as-is.
291
293
  return compile_ast(subquery.query) if subquery.is_a?(Exwiw::QueryAst::SelectSubquery)
292
294
 
295
+ # A UnionSubquery wraps several such Selects; UNION their compiled forms
296
+ # into a single id set.
297
+ if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
298
+ return subquery.queries.map { |q| compile_ast(q) }.join(' UNION ')
299
+ end
300
+
293
301
  inner_values = subquery.where_values.map { |v| escape_value(v) }
294
302
  "SELECT #{subquery.table_name}.#{subquery.select_column} " \
295
303
  "FROM #{subquery.table_name} " \
@@ -364,6 +364,8 @@ module Exwiw
364
364
  cast_to = subquery_cast_to(where_clause.value, table_name, where_clause.column_name)
365
365
  outer_key = cast_to ? "#{key}::#{cast_to}" : key
366
366
  "#{outer_key} IN (#{subquery_sql})"
367
+ elsif where_clause.operator == :not_null
368
+ "#{key} IS NOT NULL"
367
369
  else
368
370
  raise "Unsupported operator: #{where_clause.operator}"
369
371
  end
@@ -376,6 +378,15 @@ module Exwiw
376
378
  return compile_ast(subquery.query, select_cast_to: cast_to)
377
379
  end
378
380
 
381
+ # A UnionSubquery wraps several projected Selects; UNION their compiled
382
+ # forms. cast_to is the union-wide decision (see union_cast_to): when any
383
+ # arm's column type would clash with the outer column or another arm,
384
+ # every arm's projected column and the outer key are cast to text so the
385
+ # UNION and the enclosing IN comparison resolve to one type.
386
+ if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
387
+ return subquery.queries.map { |q| compile_ast(q, select_cast_to: cast_to) }.join(' UNION ')
388
+ end
389
+
379
390
  inner_values = subquery.where_values.map { |v| escape_value(v) }
380
391
  select_expr = "#{subquery.table_name}.#{subquery.select_column}"
381
392
  select_expr = "#{select_expr}::#{cast_to}" if cast_to
@@ -400,6 +411,11 @@ module Exwiw
400
411
  private def subquery_cast_to(subquery, outer_table, outer_column)
401
412
  return nil if outer_table.nil? || outer_column.nil?
402
413
 
414
+ # A UNION's arms (and the enclosing IN comparison) must all resolve to
415
+ # one type, so the cast decision must weigh every arm — not just one, as
416
+ # a flat Subquery would.
417
+ return union_cast_to(subquery, outer_table, outer_column) if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
418
+
403
419
  inner_table, inner_column = subquery_select_target(subquery)
404
420
  return nil if inner_table.nil?
405
421
 
@@ -408,6 +424,23 @@ module Exwiw
408
424
  types_need_cast?(outer_type, inner_type) ? 'text' : nil
409
425
  end
410
426
 
427
+ # Postgres rejects a UNION (or an `IN`) that mixes incompatible types
428
+ # (e.g. uuid and varchar). Examining only the first arm is not enough: a
429
+ # heterogeneous later arm would go uncast and break at execution. So
430
+ # consider the outer column together with every arm's projected column and,
431
+ # if ANY pair needs reconciliation, cast them all to text.
432
+ private def union_cast_to(union, outer_table, outer_column)
433
+ types = [column_pg_type(outer_table, outer_column)]
434
+ union.queries.each do |q|
435
+ col = q.columns.first
436
+ types << column_pg_type(q.from_table_name, col.name) if col
437
+ end
438
+ types.compact!
439
+
440
+ needs_cast = types.combination(2).any? { |a, b| types_need_cast?(a, b) }
441
+ needs_cast ? 'text' : nil
442
+ end
443
+
411
444
  private def escape_value(value)
412
445
  case value
413
446
  when nil
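Reviewer note, not part of the package: the union-wide cast decision added in this hunk weighs the outer column's type together with every arm's projected type and casts everything to `text` if any pair clashes. A minimal sketch follows, assuming a deliberately naive compatibility rule (two types clash whenever they differ). The gem's actual `types_need_cast?` is not shown in this diff, so `sketch_types_need_cast?` and `sketch_union_cast_to` are hypothetical names.

```ruby
# Hypothetical compatibility rule: any two different type names clash.
# The real rule in the adapter may treat some pairs as compatible.
def sketch_types_need_cast?(a, b)
  a != b
end

# outer_type: type of the column on the left of IN.
# arm_types:  projected column type of each UNION arm (nil when unknown).
# Returns "text" when any pair needs reconciling, otherwise nil.
def sketch_union_cast_to(outer_type, arm_types)
  types = [outer_type, *arm_types].compact
  clash = types.combination(2).any? { |a, b| sketch_types_need_cast?(a, b) }
  clash ? "text" : nil
end

p sketch_union_cast_to("uuid", %w[uuid uuid])
p sketch_union_cast_to("uuid", %w[uuid varchar])
```

Checking every pair rather than only the first arm matters because a heterogeneous later arm would otherwise go uncast and fail at execution time, as the comment in the hunk explains.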
@@ -249,6 +249,8 @@ module Exwiw
249
249
  end
250
250
  elsif where_clause.operator == :in_subquery
251
251
  "#{key} IN (#{compile_subquery(where_clause.value)})"
252
+ elsif where_clause.operator == :not_null
253
+ "#{key} IS NOT NULL"
252
254
  else
253
255
  raise "Unsupported operator: #{where_clause.operator}"
254
256
  end
@@ -259,6 +261,12 @@ module Exwiw
259
261
  # extraction query, projected to a foreign key); compile it as-is.
260
262
  return compile_ast(subquery.query) if subquery.is_a?(Exwiw::QueryAst::SelectSubquery)
261
263
 
264
+ # A UnionSubquery wraps several such Selects; UNION their compiled forms
265
+ # into a single id set.
266
+ if subquery.is_a?(Exwiw::QueryAst::UnionSubquery)
267
+ return subquery.queries.map { |q| compile_ast(q) }.join(' UNION ')
268
+ end
269
+
262
270
  inner_values = subquery.where_values.map { |v| escape_value(v) }
263
271
  "SELECT #{subquery.table_name}.#{subquery.select_column} " \
264
272
  "FROM #{subquery.table_name} " \
@@ -41,7 +41,7 @@ module Exwiw
41
41
  {
42
42
  column_name: column_name,
43
43
  operator: operator,
44
- value: value.is_a?(Subquery) || value.is_a?(SelectSubquery) ? value.to_h : value,
44
+ value: value.is_a?(Subquery) || value.is_a?(SelectSubquery) || value.is_a?(UnionSubquery) ? value.to_h : value,
45
45
  }
46
46
  end
47
47
  end
@@ -83,6 +83,25 @@ module Exwiw
83
83
  end
84
84
  end
85
85
 
86
+ # A subquery that UNIONs several single-column `Select`s into one id set.
87
+ # Used by *multi-referencer* reverse / "referenced_by" extraction
88
+ # (TableConfig#reverse_scope): a parent table referenced by many scoped
89
+ # tables is constrained to the union of every referencer's projected foreign
90
+ # key, rather than falling back to dumping every row:
91
+ #
92
+ # <parent>.<pk> IN (
93
+ # SELECT <ref1>.<col1> FROM <ref1> WHERE <ref1 scope> AND <col1> IS NOT NULL
94
+ # UNION SELECT <ref2>.<col2> FROM <ref2> WHERE <ref2 scope> AND <col2> IS NOT NULL
95
+ # )
96
+ #
97
+ # Each `queries` element is a `Select` already projected to the foreign-key
98
+ # column that points at the parent (with a NULL-excluding filter).
99
+ UnionSubquery = Struct.new(:queries, keyword_init: true) do
100
+ def to_h
101
+ { union: queries.map(&:to_h) }
102
+ end
103
+ end
104
+
86
105
  module ColumnValue
87
106
  Base = Struct.new(:name, :value, keyword_init: true)
88
107
  Plain = Class.new(Base)
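Reviewer note, not part of the package: the serialized shape of a `UnionSubquery` can be seen with a small stand-alone sketch. `SketchUnionSubquery` mirrors the struct added above, while `SketchSelect` is a hypothetical stand-in for `QueryAst::Select` that relies on `Struct#to_h`; the real `Select#to_h` output will differ.

```ruby
# Mirrors the struct in the hunk above under a different name so it can run
# without the gem loaded.
SketchUnionSubquery = Struct.new(:queries, keyword_init: true) do
  def to_h
    { union: queries.map(&:to_h) }
  end
end

# Hypothetical stand-in for a projected Select; Struct supplies to_h.
SketchSelect = Struct.new(:from_table, :column, keyword_init: true)

union = SketchUnionSubquery.new(
  queries: [
    SketchSelect.new(from_table: "customers", column: "user_id"),
    SketchSelect.new(from_table: "staff", column: "user_id")
  ]
)
p union.to_h
```

Overriding `to_h` lets the `WhereClause#to_h` branch changed earlier in this file treat a union value like the other subquery types.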
@@ -2,8 +2,8 @@
2
2
 
3
3
  module Exwiw
4
4
  class QueryAstBuilder
5
- def self.run(table_name, table_by_name, dump_target, logger, allow_reverse: true, allow_forward: true)
6
- new(table_name, table_by_name, dump_target, logger, allow_reverse: allow_reverse, allow_forward: allow_forward).run
5
+ def self.run(table_name, table_by_name, dump_target, logger, allow_reverse: true, forward_path: [])
6
+ new(table_name, table_by_name, dump_target, logger, allow_reverse: allow_reverse, forward_path: forward_path).run
7
7
  end
8
8
 
9
9
  # Scope-column mode classification for a single table. One of
@@ -49,17 +49,20 @@ module Exwiw
49
49
 
50
50
  attr_reader :table_name, :table_by_name, :dump_target
51
51
 
52
- def initialize(table_name, table_by_name, dump_target, logger, allow_reverse: true, allow_forward: true)
52
+ def initialize(table_name, table_by_name, dump_target, logger, allow_reverse: true, forward_path: [])
53
53
  @table_name = table_name
54
54
  @table_by_name = table_by_name
55
55
  @dump_target = dump_target
56
56
  @logger = logger
57
57
  @allow_reverse = allow_reverse
58
- # @allow_forward gates the "scope via an indirectly-scoped belongs_to
59
- # parent" rescue (build_belongs_to_scoped_clause). Disabled while building a
60
- # parent/child subquery so a single forward hop never recurses into another
61
- # (which could loop on a belongs_to cycle).
62
- @allow_forward = allow_forward
58
+ # @forward_path is the chain of tables currently being forward-resolved by
59
+ # the "scope via an indirectly-scoped belongs_to parent" rescue
60
+ # (build_belongs_to_scoped_clause). Each forward hop appends the table it is
61
+ # descending from, so the rescue recurses N levels (users -> end_users ->
62
+ # end_user_profiles -> ...) and stops only on a real belongs_to cycle: a
63
+ # table already on the path is not re-resolved, falling through to
64
+ # :unscopable instead of looping forever.
65
+ @forward_path = forward_path
63
66
  end
64
67
 
65
68
  def run
@@ -168,6 +171,15 @@ module Exwiw
168
171
  # such (single, unambiguous) referencer, leaving the caller to fall back to
169
172
  # the dump-all behavior.
170
173
  private def build_referenced_by_clause(table)
174
+ # Opt-in multi-referencer reverse scope (TableConfig#reverse_scope): when
175
+ # the schema author has enumerated the referencers explicitly, constrain
176
+ # the table to the UNION of those referencers' scoped queries instead of
177
+ # the single-referencer auto-detection below (which bails to a full dump
178
+ # once two or more tables reference the table).
179
+ if table.reverse_scope && table.reverse_scope.via.any?
180
+ return build_reverse_scope_via_clause(table)
181
+ end
182
+
171
183
  candidates = table_by_name.each_value.filter_map do |other|
172
184
  next if other.name == table.name
173
185
 
@@ -178,10 +190,11 @@ module Exwiw
178
190
  next if relation.nil? || relation.polymorphic?
179
191
 
180
192
  # Build the child's own extraction query. allow_reverse:false stops a
181
- # chain of FK-less tables from recursing back into each other;
182
- # allow_forward:false stops the child from forward-scoping back through
183
- # this very table (which would loop).
184
- child_query = self.class.run(other.name, table_by_name, dump_target, @logger, allow_reverse: false, allow_forward: false)
193
+ # chain of FK-less tables from recursing back into each other; adding this
194
+ # table to forward_path stops the child from forward-scoping back through
195
+ # it (which would loop) while still letting the child forward-scope
196
+ # through other tables.
197
+ child_query = self.class.run(other.name, table_by_name, dump_target, @logger, allow_reverse: false, forward_path: @forward_path + [table.name])
185
198
 
186
199
  # Only an *already constrained* child narrows anything; an unconstrained
187
200
  # child would select every fk value (i.e. dump all) and not help.
@@ -219,6 +232,64 @@ module Exwiw
219
232
  )
220
233
  end
221
234
 
235
+ # Multi-referencer reverse scope (TableConfig#reverse_scope). Builds a
236
+ # `pk IN (SELECT ref1.col1 FROM ref1 <scope> UNION SELECT ref2.col2 ...)`
237
+ # clause for a global-identity table referenced by many scoped tables. Each
238
+ # `via` arm reuses the referencer's own (already-scoped) extraction query —
239
+ # so a per-tenant run keeps only that tenant's ids — projected down to the
240
+ # foreign-key column that points at this table, with NULLs excluded.
241
+ #
242
+ # An arm whose referencer is unknown or comes out unconstrained is skipped
243
+ # with a warning rather than included: an unconstrained arm would project
244
+ # every row's id and union the whole table back, silently defeating the
245
+ # prune. Returns nil when no arm survives, leaving the caller to fall back to
246
+ # the dump-all behavior (which validate_scope! then rejects in scope mode).
247
+ private def build_reverse_scope_via_clause(table)
248
+ arms = table.reverse_scope.via.filter_map do |via|
249
+ referencer = table_by_name[via.table]
250
+ if referencer.nil?
251
+ @logger.warn(" #{table.name}.reverse_scope references unknown table '#{via.table}'; skipping arm.")
252
+ next
253
+ end
254
+
255
+ # Build the referencer's own scoped extraction query. allow_reverse is
256
+ # disabled and this table is added to forward_path to bound recursion
257
+ # exactly as the single-referencer path does (a referencer that could only
258
+ # be scoped by recursing back into this table would loop); the referencer
259
+ # may still forward-scope through other tables.
260
+ ref_query = self.class.run(referencer.name, table_by_name, dump_target, @logger, allow_reverse: false, forward_path: @forward_path + [table.name])
261
+
262
+ unless ref_query.where_clauses.any? || ref_query.join_clauses.any?
263
+ @logger.warn(
264
+ " #{table.name}.reverse_scope arm '#{via.table}.#{via.column}' is not scoped; " \
265
+ "skipping it (an unconstrained arm would union every row back). " \
266
+ "Make '#{via.table}' scopable or remove it from reverse_scope.via."
267
+ )
268
+ next
269
+ end
270
+
271
+ # Project the referencer's query to the foreign-key column that points
272
+ # at this table, excluding NULLs. Force a plain column so any masking /
273
+ # raw_sql configured on that column does not corrupt the id comparison.
274
+ fk_column = TableColumn.from_symbol_keys(name: via.column)
275
+ projected = QueryAst::Select.new
276
+ projected.from(ref_query.from_table_name)
277
+ projected.select([fk_column])
278
+ ref_query.join_clauses.each { |j| projected.join(j) }
279
+ ref_query.where_clauses.each { |w| projected.where(w) }
280
+ projected.where(QueryAst::WhereClause.new(column_name: via.column, operator: :not_null))
281
+ projected
282
+ end
283
+
284
+ return nil if arms.empty?
285
+
286
+ QueryAst::WhereClause.new(
287
+ column_name: table.primary_key,
288
+ operator: :in_subquery,
289
+ value: QueryAst::UnionSubquery.new(queries: arms)
290
+ )
291
+ end
292
+
222
293
  # Scope-column mode. Builds a `fk IN (SELECT parent.pk FROM <parent
223
294
  # extraction query>)` clause for a table whose belongs_to parent is itself
224
295
  # scopable but carries no scope column of its own — so find_path_to_scoped
@@ -230,6 +301,13 @@ module Exwiw
230
301
  # them out of a full dump. Returns nil when there is no single, unambiguous
231
302
  # scopable parent, leaving the caller on the unscopable path.
232
303
  private def build_belongs_to_scoped_clause(table)
304
+ # This table plus every ancestor currently being forward-resolved. A
305
+ # candidate parent already on this path would close a belongs_to cycle, so
306
+ # it is skipped; threading the grown path into the parent build lets the
307
+ # cascade recurse N hops (users -> end_users -> end_user_profiles -> ...)
308
+ # and terminate only when a table reappears.
309
+ forward_path = @forward_path + [table.name]
310
+
233
311
  candidates = table.belongs_tos.filter_map do |relation|
234
312
  # A polymorphic belongs_to points at several parent tables through one
235
313
  # column, so it cannot project to a single parent id set; skip it.
@@ -238,10 +316,15 @@ module Exwiw
238
316
  parent = table_by_name[relation.table_name]
239
317
  next if parent.nil?
240
318
 
319
+ # Cycle guard: descending into a parent already on the forward path would
320
+ # loop (a -> b -> a). Stop, leaving this table on the :unscopable path.
321
+ next if forward_path.include?(parent.name)
322
+
241
323
  # Build the parent's own scoped query. allow_reverse stays true so the
242
- # parent may be scoped via referenced_by; allow_forward:false bounds this
243
- # to a single forward hop so a belongs_to cycle cannot loop.
244
- parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, allow_forward: false)
324
+ # parent may be scoped via referenced_by, and forward scoping stays
325
+ # enabled so a parent that is itself scoped via *its* parent resolves
326
+ # too; this is what makes the cascade multi-hop.
327
+ parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
245
328
 
246
329
  # Only a constrained parent narrows anything; an unconstrained parent
247
330
  # would select every pk (i.e. dump all) and not help.
@@ -393,11 +476,18 @@ module Exwiw
393
476
  return :direct if directly_scoped?(table)
394
477
  return :via_path if build_join_clauses_scoped(table).any?
395
478
  return :referenced_by if @allow_reverse && build_referenced_by_clause(table)
396
- return :via_scoped_parent if @allow_forward && build_belongs_to_scoped_clause(table)
479
+ return :via_scoped_parent if forward_scope_allowed?(table) && build_belongs_to_scoped_clause(table)
397
480
 
398
481
  :unscopable
399
482
  end
400
483
 
484
+ # True when this table may still attempt the forward "scope via a scoped
485
+ # belongs_to parent" rescue: it is not already on the forward-resolution
486
+ # path, so descending into its parent cannot revisit it (a belongs_to cycle).
487
+ private def forward_scope_allowed?(table)
488
+ !@forward_path.include?(table.name)
489
+ end
490
+
401
491
  private def build_scoped(table)
402
492
  ast = QueryAst::Select.new
403
493
  ast.from(table.name)
@@ -435,11 +525,13 @@ module Exwiw
435
525
  end
436
526
  end
437
527
 
438
- if @allow_forward
528
+ if forward_scope_allowed?(table)
439
529
  # Belongs_to a parent that is itself scoped but carries no scope column of
440
530
  # its own (so via_path cannot terminate on it) — e.g. a hub table scoped
441
- # only via referenced_by. Constrain this table to that parent's in-scope
442
- # ids so its rows ride along instead of being dumped in full.
531
+ # only via referenced_by, or a parent that is itself scoped through *its*
532
+ # parent. Constrain this table to that parent's in-scope ids so its rows
533
+ # ride along instead of being dumped in full; the parent build recurses
534
+ # the cascade further up.
443
535
  parent_clause = build_belongs_to_scoped_clause(table)
444
536
  if parent_clause
445
537
  ast.where(parent_clause)
@@ -447,12 +539,13 @@ module Exwiw
447
539
  end
448
540
  end
449
541
 
450
- # Only the genuine top-level build (no rescue disabled) is allowed to fail
451
- # hard. The Runner/ExplainRunner pre-flight (validate_scope!) rejects
452
- # unscopable tables before extraction, so a top-level build never
453
- # legitimately lands here; if it does, raise rather than emit an unfiltered
454
- # (potential full PII) dump.
455
- if @allow_reverse && @allow_forward
542
+ # Only the genuine top-level build (allow_reverse on, forward_path empty,
543
+ # i.e. no rescue subquery in progress) is allowed to fail hard. The
544
+ # Runner/ExplainRunner pre-flight (validate_scope!) rejects unscopable
545
+ # tables before extraction, so a top-level build never legitimately lands
546
+ # here; if it does, raise rather than emit an unfiltered (potential full
547
+ # PII) dump.
548
+ if @allow_reverse && @forward_path.empty?
456
549
  raise ArgumentError, scope_unscopable_message(table)
457
550
  end
458
551
 
@@ -0,0 +1,47 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Exwiw
4
+ # One referencer arm of a {ReverseScope}: the referencing table and the column
5
+ # on it that points at the reverse-scoped table's primary key.
6
+ #
7
+ # `column` is given explicitly so a *non-default* foreign key (e.g.
8
+ # `business_entity_customers.kantan_yoyaku_user_id`, or `organization_admins.id`
9
+ # which itself references `users.id`) can be projected — and even a column with
10
+ # no declared `belongs_to` edge can be enumerated.
11
+ class ReverseScopeVia
12
+ include Serdes
13
+
14
+ attribute :table, String
15
+ attribute :column, String
16
+ end
17
+
18
+ # Opt-in configuration for *multi-referencer* reverse scoping
19
+ # (see {QueryAstBuilder#build_referenced_by_clause}).
20
+ #
21
+ # A global-identity table such as `users` carries no scope/tenant column and
22
+ # has no `belongs_to` path of its own to the dump target; many tenant-owned
23
+ # tables instead point *at* it. The automatic single-referencer reverse
24
+ # extraction only narrows a table referenced by exactly one constrained child
25
+ # — with two or more referencers it falls back to a full dump. `reverse_scope`
26
+ # lets the schema author enumerate the referencers whose own (already-scoped)
27
+ # extraction queries should be UNION'd into the id set this table is
28
+ # constrained to:
29
+ #
30
+ # <table>.<pk> IN (
31
+ # SELECT <ref1>.<col1> FROM <ref1> <ref1 scope> WHERE <col1> IS NOT NULL
32
+ # UNION SELECT <ref2>.<col2> FROM <ref2> <ref2 scope> WHERE <col2> IS NOT NULL
33
+ # UNION ...
34
+ # )
35
+ #
36
+ # It is deliberately explicit — never inferred or emitted by the schema
37
+ # generators, and preserved across regeneration like the other user-owned
38
+ # config (see {TableConfig#merge}). Only referencers that are themselves scoped
39
+ # belong in `via`: an unconstrained referencer would project every row's id and
40
+ # union the whole table back, defeating the prune (such an arm is skipped with
41
+ # a warning rather than silently widening the dump).
42
+ class ReverseScope
43
+ include Serdes
44
+
45
+ attribute :via, array(ReverseScopeVia), default: []
46
+ end
47
+ end
@@ -39,6 +39,14 @@ module Exwiw
39
39
  attribute :scope_exempt, Serdes::OptionalType.new(Serdes::ConcreteType.new(Boolean)), skip_serializing_if_nil: true
40
40
  attribute :scope_column, optional(String), skip_serializing_if_nil: true
41
41
 
42
+ # `reverse_scope` opts a table into multi-referencer reverse scoping (see
43
+ # Exwiw::ReverseScope and QueryAstBuilder#build_referenced_by_clause): a
44
+ # global-identity table (e.g. `users`) referenced by many scoped tables is
45
+ # constrained to the UNION of those referencers' projected foreign keys
46
+ # instead of being dumped in full. User-configured and never emitted by the
47
+ # schema generators.
48
+ attribute :reverse_scope, Serdes::OptionalType.new(ReverseScope), skip_serializing_if_nil: true
49
+
42
50
  def self.from(hash)
43
51
  config = super
44
52
  config.send(:validate_after_load!)
@@ -58,6 +66,7 @@ module Exwiw
58
66
  if rails_managed?
59
67
  hash.delete("belongs_tos")
60
68
  hash.delete("columns")
69
+ hash.delete("reverse_scope")
61
70
  end
62
71
  hash
63
72
  end
@@ -152,6 +161,7 @@ module Exwiw
152
161
  # User-owned, never regenerated: carry over from the existing config.
153
162
  merged_table.scope_exempt = scope_exempt
154
163
  merged_table.scope_column = scope_column
164
+ merged_table.reverse_scope = reverse_scope
155
165
 
156
166
  # Structural facts of each belongs_to come from the freshly generated
157
167
  # config, but the user-owned `comment`/`ignore`/`ignore_type`/`references`
@@ -199,6 +209,10 @@ module Exwiw
199
209
  raise ArgumentError,
200
210
  "Table '#{name}' has type=#{type}; columns must not be defined."
201
211
  end
212
+ if reverse_scope
213
+ raise ArgumentError,
214
+ "Table '#{name}' has type=#{type}; reverse_scope must not be defined."
215
+ end
202
216
  else
203
217
  # An ignore:true table is not extracted, so primary_key is not required
204
218
  # (e.g. a composite-primary-key table that exwiw does not support).
data/lib/exwiw/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Exwiw
4
- VERSION = "0.8.1"
4
+ VERSION = "0.8.3"
5
5
  end
data/lib/exwiw.rb CHANGED
@@ -9,6 +9,7 @@ require_relative "exwiw/ext_json"
9
9
  require_relative "exwiw/config_file"
10
10
  require_relative "exwiw/belongs_to"
11
11
  require_relative "exwiw/table_column"
12
+ require_relative "exwiw/reverse_scope"
12
13
  require_relative "exwiw/table_config"
13
14
  require_relative "exwiw/embedded_in"
14
15
  require_relative "exwiw/mongodb_field"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: exwiw
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.8.1
4
+ version: 0.8.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Shia
@@ -76,6 +76,7 @@ files:
76
76
  - lib/exwiw/query_ast.rb
77
77
  - lib/exwiw/query_ast_builder.rb
78
78
  - lib/exwiw/railtie.rb
79
+ - lib/exwiw/reverse_scope.rb
79
80
  - lib/exwiw/runner.rb
80
81
  - lib/exwiw/schema_generator.rb
81
82
  - lib/exwiw/table_column.rb