exwiw 0.9.0 → 0.9.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 171e3c591c208afcb89c2a9330c2c80d67f8f86ab370858b4690d03c799d28bc
-   data.tar.gz: f913a31f6f8e2a29a9e4b2a296926c1266c17cc8655fd4cf33110899f2b468fd
+   metadata.gz: 577420a290ea24adc64ef6ba21914ac71b1c98f86163f9d1d8a88643cc89e15b
+   data.tar.gz: eaddb0073169543aa1bea944cb1b075fbd8258a034f4cffb9cf0011338ba260a
  SHA512:
-   metadata.gz: a8d6e99e0881f4cdaa8d6d737cfbd0107382a6585f119df7d3068007ebb768a976b47fd59f78fb2d931ed95147c707523b3c8e5d1e9af0507584de93796e1ec1
-   data.tar.gz: f1cbada5c1adb8892cc297a5e864051da3bdc0b7447cbe1c386b32dde4020cbf1ac6be94935d7bb547c2ec7f4482ebf79b3e1df796bb526db7fd7d5fb42ce662
+   metadata.gz: c21735b0dcda7b30b0519944c37f24610564e91801a29e2c0f7cc3632cc0f24c532d10c078127bccffd0a2f258a5695cdabe25094edf228b5e3203a629ddf175
+   data.tar.gz: 7e9ce9682b281c097485da409befc466dd06fc655c1612278b9e98f6bccdea575aa443ff4292777be1e4e68f88d826554702eecd6b94bddc31b1e4edc01c923b
data/CHANGELOG.md CHANGED
@@ -2,6 +2,12 @@
 
  ## [Unreleased]
 
+ ## [0.9.1] - 2026-06-30
+
+ ### Fixed
+
+ - **The `reverse_scope` satellite cascade now fires in single-target (`--target-table`) mode, not only scope-column mode.** A table that `belongs_to` a `reverse_scope`'d hub (a "satellite") is scoped by constraining it to the hub's in-scope ids — the multi-hop forward (`via_scoped_parent`) cascade. That cascade ran only in scope-column mode: in single-target / PK-anchor mode the hub itself was scoped via `reverse_scope`, but its `belongs_to` children fell through to a full dump — a silent cross-tenant export in a multi-tenant schema, despite the README promising the cascade works in "both single-target and scope-column mode." The single-target build now runs the same `build_belongs_to_scoped_clause` cascade for a non-target table that has no `belongs_to` path to the dump target, so satellites tighten to the kept ids (multi-hop, keeping the single-unambiguous-parent rule, polymorphic skip, and forward-path cycle guard); it can only narrow such a table's output, never widen it. Scope-column mode is unchanged — the cascade logic is reused as-is. Because single-target mode has no `validate_scope!` pre-flight, a satellite the cascade cannot resolve to a single scopable parent (e.g. it `belongs_to` two scopable hubs) is still dumped in full but now logs a warning. The README also documents that `referenced_by` scoping takes precedence over the hub cascade (which can under-scope) and how to force the hub cascade. SQL adapters only.
+
  ## [0.9.0] - 2026-06-30
 
  ### Added
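As a rough illustration of the cascade the 0.9.1 entry describes, the toy model below uses plain Ruby hashes rather than exwiw's API. The row data and the `cascade_rows` helper are hypothetical; only the table names echo the README's `end_user_profiles → end_users → users` example. Each satellite is narrowed to the rows whose foreign key falls in its parent's in-scope ids, one hop at a time.

```ruby
require "set"

# Made-up rows; the hub (`users`) is represented only by its kept ids below.
end_user_rows = [{ id: 10, user_id: 1 }, { id: 11, user_id: 3 }]
profile_rows  = [{ id: 100, end_user_id: 10 }, { id: 101, end_user_id: 11 }]

# Hypothetical helper: keep only rows whose foreign key is among the
# parent's in-scope ids. This can narrow a table, never widen it.
def cascade_rows(rows, foreign_key, parent_ids)
  rows.select { |row| parent_ids.include?(row[foreign_key]) }
end

# Assumed: reverse_scope kept only user 1.
kept_user_ids = Set[1]

# Hop 1: end_users tighten to the kept users.
kept_end_users = cascade_rows(end_user_rows, :user_id, kept_user_ids)

# Hop 2: end_user_profiles tighten to the kept end_users.
kept_end_user_ids = kept_end_users.map { |row| row[:id] }.to_set
kept_profiles = cascade_rows(profile_rows, :end_user_id, kept_end_user_ids)
```

Without the cascade, both satellite tables in this toy would be exported in full, which is the cross-tenant leak the entry refers to.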
data/README.md CHANGED
@@ -220,6 +220,13 @@ Each table is resolved as follows:
  path, set `ignore: true` to skip it, or mark it `scope_exempt: true` (below) to
  export it in full.
 
+ > **Note — referenced-by is preferred over the hub cascade.** A table that is
+ > *both* `belongs_to` a scoped hub *and* referenced-by a constrained child is
+ > scoped to the (narrower) referenced-by id-set, not the hub cascade, so the hub's
+ > other children the child does not reference are dropped (under-scoping). To force
+ > the broader hub cascade, set `ignore: true` on the child's `belongs_to` edge that
+ > points at this table.
+
  Scope-column mode is SQL-only (mysql / postgresql / sqlite). It works with `exwiw
  explain` too, which is the recommended way to preview the queries before exporting.
 
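To make the under-scoping in the note above concrete, here is a small sketch with made-up data. The `addresses` and `orders` tables and their columns are hypothetical, not taken from exwiw or its README. It compares the id-set a table would get from referenced-by scoping with the one it would get from the hub cascade.

```ruby
require "set"

# Hypothetical schema: `addresses` belongs_to a scoped hub (via account_id)
# and is also referenced by a constrained child, `orders` (via address_id).
address_rows = [
  { id: 10, account_id: 1 },
  { id: 11, account_id: 1 },
  { id: 12, account_id: 2 }
]
in_scope_order_rows = [{ id: 500, address_id: 10 }]
in_scope_account_ids = Set[1]

# Referenced-by scoping: only the addresses the in-scope orders point at.
referenced_by_ids = in_scope_order_rows.map { |row| row[:address_id] }.to_set

# Hub cascade: every address belonging to an in-scope account.
hub_cascade_ids = address_rows
  .select { |row| in_scope_account_ids.include?(row[:account_id]) }
  .map { |row| row[:id] }
  .to_set

# Address 11 belongs to the in-scope account but no in-scope order
# references it, so referenced-by scoping drops it.
dropped_ids = hub_cascade_ids - referenced_by_ids
```

In this toy, ignoring the `orders → addresses` edge would correspond to exporting `hub_cascade_ids` instead of `referenced_by_ids`.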
@@ -658,7 +665,7 @@ Notes:
  - **Only scoped referencers belong in `via`.** Each arm's query must come out constrained; an unconstrained referencer (e.g. a `scope_exempt` table, or one with no path to a scope) would project *every* id and union the whole table back — so such an arm is **skipped with a warning** rather than silently widening the dump. An unknown table is likewise skipped with a warning. If no arm survives, the table stays unscopable and (in [scope-column mode](#scope-column-mode)) the run aborts via `validate_scope!`.
  - **NULLs are excluded** per arm (`IS NOT NULL`).
  - **Satellites need no config.** A table that `belongs_to` the reverse-scoped table (e.g. `end_users.id → users.id`, or `identities.user_id → users.id`) tightens to the kept ids automatically through the normal cascade — only the reverse-scoped table itself declares `reverse_scope`. The cascade is **multi-hop**, so a table several `belongs_to` hops below the reverse-scoped table (e.g. `end_user_profiles → end_users → users`) also tightens automatically, with no config of its own.
- - Works in both single-target and scope-column mode. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
+ - Works in both single-target and scope-column mode. In single-target mode there is no scope-column pre-flight (`validate_scope!`), so a satellite the cascade cannot resolve to a single scopable parent (e.g. it `belongs_to` two scopable hubs) is dumped in full with a warning rather than aborting. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
 
  ### Why a JOIN, not `IN (subquery)`
 
@@ -86,6 +86,28 @@ module Exwiw
          where_clauses.push(reverse_clause) if reverse_clause
        end
 
+       # Forward cascade. A satellite of a reverse_scope'd (or referenced-by-scoped)
+       # hub has no belongs_to path to the dump target, so the clauses above stay
+       # empty and it would dump every row. When its belongs_to parent is itself
+       # scoped, constrain this table to the parent's in-scope ids — the same
+       # multi-hop cascade scope-column mode performs in build_scoped.
+       if table.name != dump_target.table_name &&
+          where_clauses.empty? && join_clauses.empty? &&
+          forward_scope_allowed?(table)
+         parent_clause = build_belongs_to_scoped_clause(table)
+         if parent_clause
+           where_clauses.push(parent_clause)
+         elsif @allow_reverse && @forward_path.empty? && !scope_exempt?(table) &&
+               scopable_parent_candidates(table).size > 1
+           @logger.warn(
+             " #{table.name} belongs_to multiple scopable parents; the cascade cannot " \
+             "pick one unambiguously, so it is dumped in full. If this is intended, set " \
+             "`scope_exempt: true`. Otherwise, scope it through a single parent (e.g. ignore one belongs_to edge), " \
+             "or switch to scope-column mode to scope it directly."
+           )
+         end
+       end
+
        QueryAst::Select.new.tap do |ast|
          ast.from(table.name)
          if table.rails_managed?
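The branch added in this hunk boils down to a three-way outcome on the number of scopable parents. The sketch below is a deliberately simplified stand-in, not exwiw's code: `satellite_outcome` is a hypothetical name, and the real branch also checks `forward_scope_allowed?`, `@allow_reverse`, `@forward_path.empty?` and `scope_exempt?` before warning.

```ruby
# Hypothetical, simplified model of the single-target decision for a table
# that has no where/join clauses of its own.
def satellite_outcome(scopable_parents)
  case scopable_parents.size
  when 1
    # Exactly one scopable parent: constrain the table to that parent's ids.
    [:scoped_by, scopable_parents.first]
  when 0
    # Nothing to cascade from: the table is dumped in full, as before.
    [:full_dump, :no_warning]
  else
    # Two or more scopable parents: ambiguous, so dump in full but warn.
    [:full_dump, :warning]
  end
end

one_parent   = satellite_outcome(["users"])
no_parent    = satellite_outcome([])
many_parents = satellite_outcome(["users", "organizations"])
```

The table names passed in are placeholders; only the count matters for the outcome.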
@@ -290,48 +312,18 @@ module Exwiw
        )
      end
 
-     # Scope-column mode. Builds a `fk IN (SELECT parent.pk FROM <parent
-     # extraction query>)` clause for a table whose belongs_to parent is itself
-     # scopable but carries no scope column of its own — so find_path_to_scoped
-     # cannot terminate on it (via_path fails) and nothing references this table
-     # (referenced_by fails). The classic shape is a hub scoped only via
-     # referenced_by (e.g. CDP `customer_accounts`, scoped by the `customers` that
-     # reference it) with sibling detail tables (`customer_account_details`, ...)
-     # hanging off it. Constraining those siblings to the hub's in-scope ids keeps
-     # them out of a full dump. Returns nil when there is no single, unambiguous
-     # scopable parent, leaving the caller on the unscopable path.
+     # Builds a `fk IN (SELECT parent.pk FROM <parent extraction query>)` clause
+     # for a table whose belongs_to parent is itself scopable but carries no scope
+     # column of its own — so find_path_to_scoped cannot terminate on it (via_path
+     # fails) and nothing references this table (referenced_by fails). The classic
+     # shape is a hub scoped only via referenced_by (e.g. CDP `customer_accounts`,
+     # scoped by the `customers` that reference it) with sibling detail tables
+     # (`customer_account_details`, ...) hanging off it. Constraining those
+     # siblings to the hub's in-scope ids keeps them out of a full dump. Returns
+     # nil when there is no single, unambiguous scopable parent, leaving the caller
+     # on the unscopable path. Used by both scope-column and single-target mode.
      private def build_belongs_to_scoped_clause(table)
-       # This table plus every ancestor currently being forward-resolved. A
-       # candidate parent already on this path would close a belongs_to cycle, so
-       # it is skipped; threading the grown path into the parent build lets the
-       # cascade recurse N hops (users -> end_users -> end_user_profiles -> ...)
-       # and terminate only when a table reappears.
-       forward_path = @forward_path + [table.name]
-
-       candidates = table.belongs_tos.filter_map do |relation|
-         # A polymorphic belongs_to points at several parent tables through one
-         # column, so it cannot project to a single parent id set; skip it.
-         next if relation.polymorphic?
-
-         parent = table_by_name[relation.table_name]
-         next if parent.nil?
-
-         # Cycle guard: descending into a parent already on the forward path would
-         # loop (a -> b -> a). Stop, leaving this table on the :unscopable path.
-         next if forward_path.include?(parent.name)
-
-         # Build the parent's own scoped query. allow_reverse stays true so the
-         # parent may be scoped via referenced_by, and forward scoping stays
-         # enabled so a parent that is itself scoped via *its* parent resolves
-         # too — this is what makes the cascade multi-hop.
-         parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
-
-         # Only a constrained parent narrows anything; an unconstrained parent
-         # would select every pk (i.e. dump all) and not help.
-         next unless parent_query.where_clauses.any? || parent_query.join_clauses.any?
-
-         [relation, parent, parent_query]
-       end
+       candidates = scopable_parent_candidates(table)
 
        # Only the unambiguous single-parent case. Multiple scopable parents would
        # need their subqueries combined (not supported); fall back to unscopable.
@@ -360,6 +352,45 @@ module Exwiw
        )
      end
 
+     # The scopable belongs_to parents of `table`: each non-polymorphic parent
+     # whose own extraction query comes out constrained, paired with the relation
+     # and that query. Shared by build_belongs_to_scoped_clause (which requires
+     # exactly one) and the single-target full-dump warning (which flags two or
+     # more, since the cascade then cannot disambiguate).
+     private def scopable_parent_candidates(table)
+       # Memoized: #run can resolve this twice for the same table (once via
+       # build_belongs_to_scoped_clause, once for the ambiguous-parent warning),
+       # and each pass recursively builds every parent's query.
+       (@scopable_parent_candidates ||= {})[table.name] ||= begin
+         # This table plus every ancestor currently being forward-resolved; a
+         # candidate parent already on this path would close a belongs_to cycle, so
+         # it is skipped. Threading the grown path into the parent build lets the
+         # cascade recurse N hops and terminate only when a table reappears.
+         forward_path = @forward_path + [table.name]
+
+         table.belongs_tos.filter_map do |relation|
+           # A polymorphic belongs_to points at several parent tables through one
+           # column, so it cannot project to a single parent id set.
+           next if relation.polymorphic?
+
+           parent = table_by_name[relation.table_name]
+           next if parent.nil?
+           next if forward_path.include?(parent.name)
+
+           # allow_reverse and forward scoping stay enabled so the parent may itself
+           # be scoped via referenced_by or via *its* parent — this is what makes the
+           # cascade multi-hop.
+           parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
+
+           # Only a constrained parent narrows anything; an unconstrained parent
+           # would select every pk (i.e. dump all) and not help.
+           next unless parent_query.where_clauses.any? || parent_query.join_clauses.any?
+
+           [relation, parent, parent_query]
+         end
+       end
+     end
+
      private def build_where_clauses(table, dump_target)
        clauses = []
 
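The memoization in `scopable_parent_candidates` uses the per-key `(@cache ||= {})[key] ||= begin ... end` idiom. The standalone sketch below shows how that idiom behaves; the `CandidateCache` class and its fake lookup are hypothetical stand-ins for the recursive parent-query build. One detail worth noting: an empty array is truthy in Ruby, so a table with no scopable parents is also cached, whereas a `nil` or `false` result would be recomputed on every call.

```ruby
# Hypothetical stand-in for the builder: counts how often the expensive
# block actually runs.
class CandidateCache
  attr_reader :computations

  def initialize
    @computations = 0
  end

  def candidates_for(table_name)
    # Lazily create the hash, then compute each key at most once
    # (as long as the block returns a truthy value).
    (@candidates ||= {})[table_name] ||= begin
      @computations += 1
      # Fake lookup: "orphans" has no scopable parent, everything else has one.
      table_name == "orphans" ? [] : ["users"]
    end
  end
end

cache = CandidateCache.new
first_hit   = cache.candidates_for("end_users")
second_hit  = cache.candidates_for("end_users")
empty_first = cache.candidates_for("orphans")
empty_again = cache.candidates_for("orphans")
```

After these four calls the block has run twice, once per distinct table name, and the repeated calls return the same cached array objects.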
data/lib/exwiw/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module Exwiw
-   VERSION = "0.9.0"
+   VERSION = "0.9.1"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: exwiw
  version: !ruby/object:Gem::Version
-   version: 0.9.0
+   version: 0.9.1
  platform: ruby
  authors:
  - Shia