exwiw 0.9.0 → 0.9.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -0
- data/README.md +8 -1
- data/lib/exwiw/query_ast_builder.rb +72 -41
- data/lib/exwiw/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 577420a290ea24adc64ef6ba21914ac71b1c98f86163f9d1d8a88643cc89e15b
+  data.tar.gz: eaddb0073169543aa1bea944cb1b075fbd8258a034f4cffb9cf0011338ba260a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: c21735b0dcda7b30b0519944c37f24610564e91801a29e2c0f7cc3632cc0f24c532d10c078127bccffd0a2f258a5695cdabe25094edf228b5e3203a629ddf175
+  data.tar.gz: 7e9ce9682b281c097485da409befc466dd06fc655c1612278b9e98f6bccdea575aa443ff4292777be1e4e68f88d826554702eecd6b94bddc31b1e4edc01c923b
data/CHANGELOG.md
CHANGED
@@ -2,6 +2,12 @@
 
 ## [Unreleased]
 
+## [0.9.1] - 2026-06-30
+
+### Fixed
+
+- **The `reverse_scope` satellite cascade now fires in single-target (`--target-table`) mode, not only scope-column mode.** A table that `belongs_to` a `reverse_scope`'d hub (a "satellite") is scoped by constraining it to the hub's in-scope ids — the multi-hop forward (`via_scoped_parent`) cascade. That cascade ran only in scope-column mode: in single-target / PK-anchor mode the hub itself was scoped via `reverse_scope`, but its `belongs_to` children fell through to a full dump — a silent cross-tenant export in a multi-tenant schema, despite the README promising the cascade works in "both single-target and scope-column mode." The single-target build now runs the same `build_belongs_to_scoped_clause` cascade for a non-target table that has no `belongs_to` path to the dump target, so satellites tighten to the kept ids (multi-hop, keeping the single-unambiguous-parent rule, polymorphic skip, and forward-path cycle guard); it can only narrow such a table's output, never widen it. Scope-column mode is unchanged — the cascade logic is reused as-is. Because single-target mode has no `validate_scope!` pre-flight, a satellite the cascade cannot resolve to a single scopable parent (e.g. it `belongs_to` two scopable hubs) is still dumped in full but now logs a warning. The README also documents that `referenced_by` scoping takes precedence over the hub cascade (which can under-scope) and how to force the hub cascade. SQL adapters only.
+
 ## [0.9.0] - 2026-06-30
 
 ### Added
|
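The cascade rules this changelog entry names (multi-hop resolution, the single-unambiguous-parent rule, the polymorphic skip, and the forward-path cycle guard) can be sketched on a toy schema. This is a minimal illustration, not exwiw's implementation: `Table`, `Relation`, `rel`, `cascade_path`, and the `TABLES` fixture are hypothetical names, and the real builder emits SQL clauses rather than returning table chains.

```ruby
# Toy model of the forward cascade. Every name here is hypothetical and
# exists only for illustration; none of it is exwiw's API.
Relation = Struct.new(:table_name, :polymorphic, keyword_init: true)
Table = Struct.new(:name, :belongs_tos, :scoped, keyword_init: true)

# Small helper to keep the fixture readable.
def rel(table_name, polymorphic: false)
  Relation.new(table_name: table_name, polymorphic: polymorphic)
end

# Returns the chain of tables from `table` up to a scoped hub, or nil when
# the table cannot be resolved and would fall back to a full dump.
def cascade_path(table, tables, forward_path = [])
  return [table.name] if table.scoped

  path = forward_path + [table.name]
  candidates = table.belongs_tos.filter_map do |relation|
    next if relation.polymorphic # polymorphic skip

    parent = tables[relation.table_name]
    next if parent.nil?
    next if path.include?(parent.name) # forward-path cycle guard

    # Recurse with the grown path: this is what makes the cascade multi-hop.
    # A nil result means the parent is unconstrained and is dropped here.
    cascade_path(parent, tables, path)
  end

  # Single-unambiguous-parent rule: zero or several candidates resolve to nil.
  return nil unless candidates.size == 1

  [table.name] + candidates.first
end

TABLES = {
  "users" => Table.new(name: "users", belongs_tos: [], scoped: true),
  "orgs" => Table.new(name: "orgs", belongs_tos: [], scoped: true),
  "end_users" => Table.new(name: "end_users", belongs_tos: [rel("users")], scoped: false),
  "end_user_profiles" => Table.new(name: "end_user_profiles", belongs_tos: [rel("end_users")], scoped: false),
  "memberships" => Table.new(name: "memberships", belongs_tos: [rel("users"), rel("orgs")], scoped: false),
  "comments" => Table.new(name: "comments", belongs_tos: [rel("commentable", polymorphic: true)], scoped: false)
}.freeze
```

In this sketch `cascade_path(TABLES["end_user_profiles"], TABLES)` resolves through `end_users` to `users`, while `memberships` (two scopable parents) and `comments` (only a polymorphic parent) return `nil`, which stands in for the full-dump fallback that now logs a warning in single-target mode.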
data/README.md
CHANGED
@@ -220,6 +220,13 @@ Each table is resolved as follows:
 path, set `ignore: true` to skip it, or mark it `scope_exempt: true` (below) to
 export it in full.
 
+> **Note — referenced-by is preferred over the hub cascade.** A table that is
+> *both* `belongs_to` a scoped hub *and* referenced-by a constrained child is
+> scoped to the (narrower) referenced-by id-set, not the hub cascade, so the hub's
+> other children the child does not reference are dropped (under-scoping). To force
+> the broader hub cascade, set `ignore: true` on the child's `belongs_to` edge that
+> points at this table.
+
 Scope-column mode is SQL-only (mysql / postgresql / sqlite). It works with `exwiw
 explain` too, which is the recommended way to preview the queries before exporting.
 
@@ -658,7 +665,7 @@ Notes:
 - **Only scoped referencers belong in `via`.** Each arm's query must come out constrained; an unconstrained referencer (e.g. a `scope_exempt` table, or one with no path to a scope) would project *every* id and union the whole table back — so such an arm is **skipped with a warning** rather than silently widening the dump. An unknown table is likewise skipped with a warning. If no arm survives, the table stays unscopable and (in [scope-column mode](#scope-column-mode)) the run aborts via `validate_scope!`.
 - **NULLs are excluded** per arm (`IS NOT NULL`).
 - **Satellites need no config.** A table that `belongs_to` the reverse-scoped table (e.g. `end_users.id → users.id`, or `identities.user_id → users.id`) tightens to the kept ids automatically through the normal cascade — only the reverse-scoped table itself declares `reverse_scope`. The cascade is **multi-hop**, so a table several `belongs_to` hops below the reverse-scoped table (e.g. `end_user_profiles → end_users → users`) also tightens automatically, with no config of its own.
-- Works in both single-target and scope-column mode. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
+- Works in both single-target and scope-column mode. In single-target mode there is no scope-column pre-flight (`validate_scope!`), so a satellite the cascade cannot resolve to a single scopable parent (e.g. it `belongs_to` two scopable hubs) is dumped in full with a warning rather than aborting. Polymorphic foreign keys are not eligible as anchors (the named `column` is always a concrete column).
 
 ### Why a JOIN, not `IN (subquery)`
 
|
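The precedence described in the README note above comes down to two different id-sets for the same table. The sketch below computes both over hypothetical in-memory rows to show which rows the narrower referenced-by set drops. `ADDRESSES`, `IN_SCOPE_ORDERS`, `IN_SCOPE_ACCOUNT_IDS`, and both helper methods are invented for illustration; exwiw builds SQL rather than filtering arrays.

```ruby
# Hypothetical rows: `addresses` belongs_to the scoped hub `accounts` and is
# also referenced by the constrained child `orders`. Not exwiw's data model.
ADDRESSES = [
  { id: 10, account_id: 1 },
  { id: 11, account_id: 1 },
  { id: 12, account_id: 2 },
  { id: 13, account_id: 3 }
].freeze

# Orders that survived scoping. One has a NULL foreign key.
IN_SCOPE_ORDERS = [
  { id: 100, address_id: 10 },
  { id: 101, address_id: nil }
].freeze

# Hub ids that are in scope.
IN_SCOPE_ACCOUNT_IDS = [1, 2].freeze

# Referenced-by scoping: only ids the in-scope child rows point at.
# NULL foreign keys are excluded, mirroring the `IS NOT NULL` rule.
def referenced_by_ids(orders)
  orders.map { |order| order[:address_id] }.compact.uniq
end

# Hub cascade: every row whose parent hub id is in scope.
def hub_cascade_ids(addresses, account_ids)
  addresses.select { |row| account_ids.include?(row[:account_id]) }
           .map { |row| row[:id] }
end

narrow = referenced_by_ids(IN_SCOPE_ORDERS)
broad = hub_cascade_ids(ADDRESSES, IN_SCOPE_ACCOUNT_IDS)

# Rows that belong to in-scope hubs but are dropped by referenced-by scoping.
dropped = broad - narrow
puts "referenced-by: #{narrow.inspect}, hub cascade: #{broad.inspect}, dropped: #{dropped.inspect}"
```

Here the referenced-by set holds only address 10, while the hub cascade would keep 10, 11, and 12. Addresses 11 and 12 are the under-scoped rows the note warns about, and they are what forcing the hub cascade would bring back.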
data/lib/exwiw/query_ast_builder.rb
CHANGED
@@ -86,6 +86,28 @@ module Exwiw
         where_clauses.push(reverse_clause) if reverse_clause
       end
 
+      # Forward cascade. A satellite of a reverse_scope'd (or referenced-by-scoped)
+      # hub has no belongs_to path to the dump target, so the clauses above stay
+      # empty and it would dump every row. When its belongs_to parent is itself
+      # scoped, constrain this table to the parent's in-scope ids — the same
+      # multi-hop cascade scope-column mode performs in build_scoped.
+      if table.name != dump_target.table_name &&
+         where_clauses.empty? && join_clauses.empty? &&
+         forward_scope_allowed?(table)
+        parent_clause = build_belongs_to_scoped_clause(table)
+        if parent_clause
+          where_clauses.push(parent_clause)
+        elsif @allow_reverse && @forward_path.empty? && !scope_exempt?(table) &&
+              scopable_parent_candidates(table).size > 1
+          @logger.warn(
+            " #{table.name} belongs_to multiple scopable parents; the cascade cannot " \
+            "pick one unambiguously, so it is dumped in full. If this is intended, set " \
+            "`scope_exempt: true`. Otherwise, scope it through a single parent (e.g. ignore one belongs_to edge), " \
+            "or switch to scope-column mode to scope it directly."
+          )
+        end
+      end
+
       QueryAst::Select.new.tap do |ast|
         ast.from(table.name)
         if table.rails_managed?
@@ -290,48 +312,18 @@ module Exwiw
       )
     end
 
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
-    #
+    # Builds a `fk IN (SELECT parent.pk FROM <parent extraction query>)` clause
+    # for a table whose belongs_to parent is itself scopable but carries no scope
+    # column of its own — so find_path_to_scoped cannot terminate on it (via_path
+    # fails) and nothing references this table (referenced_by fails). The classic
+    # shape is a hub scoped only via referenced_by (e.g. CDP `customer_accounts`,
+    # scoped by the `customers` that reference it) with sibling detail tables
+    # (`customer_account_details`, ...) hanging off it. Constraining those
+    # siblings to the hub's in-scope ids keeps them out of a full dump. Returns
+    # nil when there is no single, unambiguous scopable parent, leaving the caller
+    # on the unscopable path. Used by both scope-column and single-target mode.
     private def build_belongs_to_scoped_clause(table)
-
-      # candidate parent already on this path would close a belongs_to cycle, so
-      # it is skipped; threading the grown path into the parent build lets the
-      # cascade recurse N hops (users -> end_users -> end_user_profiles -> ...)
-      # and terminate only when a table reappears.
-      forward_path = @forward_path + [table.name]
-
-      candidates = table.belongs_tos.filter_map do |relation|
-        # A polymorphic belongs_to points at several parent tables through one
-        # column, so it cannot project to a single parent id set; skip it.
-        next if relation.polymorphic?
-
-        parent = table_by_name[relation.table_name]
-        next if parent.nil?
-
-        # Cycle guard: descending into a parent already on the forward path would
-        # loop (a -> b -> a). Stop, leaving this table on the :unscopable path.
-        next if forward_path.include?(parent.name)
-
-        # Build the parent's own scoped query. allow_reverse stays true so the
-        # parent may be scoped via referenced_by, and forward scoping stays
-        # enabled so a parent that is itself scoped via *its* parent resolves
-        # too — this is what makes the cascade multi-hop.
-        parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
-
-        # Only a constrained parent narrows anything; an unconstrained parent
-        # would select every pk (i.e. dump all) and not help.
-        next unless parent_query.where_clauses.any? || parent_query.join_clauses.any?
-
-        [relation, parent, parent_query]
-      end
+      candidates = scopable_parent_candidates(table)
 
       # Only the unambiguous single-parent case. Multiple scopable parents would
       # need their subqueries combined (not supported); fall back to unscopable.
@@ -360,6 +352,45 @@ module Exwiw
       )
     end
 
+    # The scopable belongs_to parents of `table`: each non-polymorphic parent
+    # whose own extraction query comes out constrained, paired with the relation
+    # and that query. Shared by build_belongs_to_scoped_clause (which requires
+    # exactly one) and the single-target full-dump warning (which flags two or
+    # more, since the cascade then cannot disambiguate).
+    private def scopable_parent_candidates(table)
+      # Memoized: #run can resolve this twice for the same table (once via
+      # build_belongs_to_scoped_clause, once for the ambiguous-parent warning),
+      # and each pass recursively builds every parent's query.
+      (@scopable_parent_candidates ||= {})[table.name] ||= begin
+        # This table plus every ancestor currently being forward-resolved; a
+        # candidate parent already on this path would close a belongs_to cycle, so
+        # it is skipped. Threading the grown path into the parent build lets the
+        # cascade recurse N hops and terminate only when a table reappears.
+        forward_path = @forward_path + [table.name]
+
+        table.belongs_tos.filter_map do |relation|
+          # A polymorphic belongs_to points at several parent tables through one
+          # column, so it cannot project to a single parent id set.
+          next if relation.polymorphic?
+
+          parent = table_by_name[relation.table_name]
+          next if parent.nil?
+          next if forward_path.include?(parent.name)
+
+          # allow_reverse and forward scoping stay enabled so the parent may itself
+          # be scoped via referenced_by or via *its* parent — this is what makes the
+          # cascade multi-hop.
+          parent_query = self.class.run(parent.name, table_by_name, dump_target, @logger, allow_reverse: true, forward_path: forward_path)
+
+          # Only a constrained parent narrows anything; an unconstrained parent
+          # would select every pk (i.e. dump all) and not help.
+          next unless parent_query.where_clauses.any? || parent_query.join_clauses.any?
+
+          [relation, parent, parent_query]
+        end
+      end
+    end
+
     private def build_where_clauses(table, dump_target)
       clauses = []
 
|
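The memoization in `scopable_parent_candidates` uses the `(@cache ||= {})[key] ||= begin ... end` idiom. A standalone sketch of how that idiom behaves follows, using a hypothetical `CandidateResolver` class with a build counter; neither is part of exwiw, and the placeholder lookup stands in for the recursive parent query builds.

```ruby
# Hypothetical class demonstrating per-key memoization with a lazily
# created hash. Only the idiom itself is taken from the diff above.
class CandidateResolver
  attr_reader :builds

  def initialize
    @builds = 0 # counts how many times the expensive block actually runs
  end

  def candidates_for(table_name)
    # `@candidates ||= {}` creates the cache on first use; the outer `||=`
    # runs the block only when the key has no truthy value yet.
    (@candidates ||= {})[table_name] ||= begin
      @builds += 1
      # Placeholder for the expensive work (assumed for this sketch).
      table_name == "orphans" ? [] : ["#{table_name}_parent"]
    end
  end
end

resolver = CandidateResolver.new
resolver.candidates_for("end_users")
resolver.candidates_for("end_users") # served from the cache
resolver.candidates_for("orphans")
resolver.candidates_for("orphans")   # empty array is cached as well
puts resolver.builds
```

Because an empty array is truthy in Ruby, a table with no scopable parents is cached too. The idiom would recompute on every call only if the block returned `nil` or `false`, which `filter_map` never does.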
data/lib/exwiw/version.rb
CHANGED