data_shifter 0.1.0 → 0.3.0
- checksums.yaml +4 -4
- data/.husky/pre-commit +0 -3
- data/CHANGELOG.md +39 -0
- data/README.md +158 -46
- data/lib/data_shifter/configuration.rb +42 -0
- data/lib/data_shifter/errors.rb +46 -0
- data/lib/data_shifter/internal/colors.rb +71 -0
- data/lib/data_shifter/internal/env.rb +8 -6
- data/lib/data_shifter/internal/log_deduplicator.rb +149 -0
- data/lib/data_shifter/internal/output.rb +118 -69
- data/lib/data_shifter/internal/side_effect_guards.rb +120 -0
- data/lib/data_shifter/shift.rb +212 -23
- data/lib/data_shifter/version.rb +1 -1
- data/lib/data_shifter.rb +21 -0
- data/lib/generators/data_shift_generator.rb +90 -13
- metadata +21 -3
- data/lib/data_shifter/rubocop.rb +0 -4
- data/lib/rubocop/cop/data_shifter/skip_transaction_guard_dry_run.rb +0 -55
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 7c2c6cb1c13bba3100efe294ceeafd838f4c329650b323457b0a75296bbff28f
|
|
4
|
+
data.tar.gz: b2dfe9b104bcc97fe7f8d1524d9739cb6b5918885a64f0ccf58725a8477d4298
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: c501bef9515dae1a53a20dd4b40e4fb454c31b17a781b53df02b861f6ae0415f012a801f2b767f5975cd8cb5aa69555467de5a89dff4448b766ab410e23e7a5a
|
|
7
|
+
data.tar.gz: cd30ac7dcef93f800101c26a5472063859e8671270c58169083c7116fdd3cac2f6559efe6f9da084a2b7b939aa2ef0e909a96b816964043f8739d544a80fef34
|
data/.husky/pre-commit
CHANGED
data/CHANGELOG.md
ADDED
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
## [Unreleased]
|
|
4
|
+
|
|
5
|
+
* N/A
|
|
6
|
+
|
|
7
|
+
## [0.3.0]
|
|
8
|
+
|
|
9
|
+
### Added
|
|
10
|
+
|
|
11
|
+
- **Task-based shifts**: New `task` DSL for targeted, one-off changes without the `collection`/`process_record` pattern. Define one or more `task "label" do ... end` blocks that run in sequence with shared transaction and dry-run semantics. Labels appear in output and error messages.
|
|
12
|
+
- **Generator `--task` option**: `rails g data_shift fix_order_1234 --task` generates a shift with a `task` block instead of `collection`/`process_record`.
|
|
13
|
+
- **Colorized CLI output**: Headers, summaries, and status output now use ANSI colors for better readability. Colors are automatically disabled when output is not a TTY or when `NO_COLOR` environment variable is set.
|
|
14
|
+
- **Cleaner summaries**: `Failed` and `Skipped` lines are now omitted from summaries when their values are zero.
|
|
15
|
+
|
|
16
|
+
### Changed
|
|
17
|
+
|
|
18
|
+
- **Improved error messages**: `NotImplementedError` messages for `collection` and `process_record` now suggest using `task` blocks as an alternative.
|
|
19
|
+
- **Task labels logged on execution**: When running task-based shifts, each labeled task logs its name (`>> label`) when it starts.
|
|
20
|
+
|
|
21
|
+
## [0.2.0]
|
|
22
|
+
|
|
23
|
+
### Added
|
|
24
|
+
|
|
25
|
+
- **Configuration object**: New `DataShifter.configure` block for global settings.
|
|
26
|
+
- **Dry-run rollback for `transaction false`**: Shifts using `transaction false` (or `:none`) now roll back DB changes in dry-run mode, matching the behavior of other transaction modes.
|
|
27
|
+
- **Automatic side-effect guards in dry run**: When a shift runs in dry run mode, HTTP (via WebMock), ActionMailer, ActiveJob, and Sidekiq (if loaded) are now automatically blocked or faked so that unguarded external calls do not run. Restore happens in an `ensure` so state is reverted after the run.
|
|
28
|
+
- **HTTP**: All outbound requests are blocked unless allowed with the per-shift `allow_external_requests [...]` DSL or global `DataShifter.config.allow_external_requests`.
|
|
29
|
+
- **ActionMailer**: `perform_deliveries = false` for the duration of the dry run.
|
|
30
|
+
- **ActiveJob**: Queue adapter set to `:test` for the duration of the dry run.
|
|
31
|
+
- **Sidekiq**: `Sidekiq::Testing.fake!` for the duration of the dry run (only if `Sidekiq::Testing` is already loaded).
|
|
32
|
+
- Dependency on `webmock` (>= 3.18) for dry-run HTTP blocking.
|
|
33
|
+
- **Log deduplication**: Repeated log messages are now suppressed during shift runs (default: on). First occurrence logs normally; subsequent occurrences are counted and a summary is printed at the end. Configure globally with `config.suppress_repeated_logs` and `config.repeated_log_cap` (default 1000). Override per-shift with `suppress_repeated_logs false`.
|
|
34
|
+
- **Global progress bar default**: `config.progress_enabled` (default `true`) sets the default for all shifts. Per-shift `progress true/false` still overrides.
|
|
35
|
+
- **Global status interval**: `config.status_interval_seconds` (default `nil`) provides a fallback when `STATUS_INTERVAL` env var is not set.
|
|
36
|
+
- **skip! abort behavior**: `skip!` now terminates the current `process_record` (no `return` needed after calling it).
|
|
37
|
+
- **Grouped skip reasons**: Skip reasons are grouped and the top 10 (by count) are shown in the summary and status output instead of logging each skip inline.
|
|
38
|
+
|
|
39
|
+
## [0.1.0] - Initial release
|
data/README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# DataShifter
|
|
2
2
|
|
|
3
|
-
Rake-backed data migrations (
|
|
3
|
+
Rake-backed data migrations ("shifts") for Rails apps, with **dry run by default**, progress output, and a consistent summary. Define shift classes in `lib/data_shifts/*.rb`; run them as `rake data:shift:<task_name>`.
|
|
4
4
|
|
|
5
5
|
## Installation
|
|
6
6
|
|
|
@@ -21,7 +21,7 @@ Generate a shift (optionally scoped to a model):
|
|
|
21
21
|
|
|
22
22
|
```bash
|
|
23
23
|
bin/rails generate data_shift backfill_foo
|
|
24
|
-
bin/rails generate data_shift backfill_users --model
|
|
24
|
+
bin/rails generate data_shift backfill_users --model User
|
|
25
25
|
```
|
|
26
26
|
|
|
27
27
|
Add your logic to the generated file in `lib/data_shifts/`.
|
|
@@ -33,22 +33,11 @@ rake data:shift:backfill_foo
|
|
|
33
33
|
COMMIT=1 rake data:shift:backfill_foo
|
|
34
34
|
```
|
|
35
35
|
|
|
36
|
-
## How shift files map to rake tasks
|
|
37
|
-
|
|
38
|
-
DataShifter defines one rake task per file in `lib/data_shifts/*.rb`.
|
|
39
|
-
|
|
40
|
-
- **Task name**: derived from the filename with any leading digits removed.
|
|
41
|
-
- `20260201120000_backfill_foo.rb` → `data:shift:backfill_foo` (leading `<digits>_` prefix is stripped)
|
|
42
|
-
- `backfill_foo.rb` → `data:shift:backfill_foo`
|
|
43
|
-
- **Class name**: task name camelized, inside the `DataShifts` module.
|
|
44
|
-
- `backfill_foo` → `DataShifts::BackfillFoo`
|
|
45
|
-
|
|
46
|
-
Shift files are **required only when the task runs** (tasks are defined up front; classes load lazily).
|
|
47
|
-
The `description "..."` line is extracted from the file and used for `rake -T` output without loading the shift class.
|
|
48
|
-
|
|
49
36
|
## Defining a shift
|
|
50
37
|
|
|
51
|
-
|
|
38
|
+
### Collection-based shifts (typical)
|
|
39
|
+
|
|
40
|
+
For systemic migrations across many records, implement:
|
|
52
41
|
|
|
53
42
|
- **`collection`**: an `ActiveRecord::Relation` (uses `find_each`) or an `Array`/Enumerable
|
|
54
43
|
- **`process_record(record)`**: applies the change for one record
|
|
@@ -69,15 +58,86 @@ module DataShifts
|
|
|
69
58
|
end
|
|
70
59
|
```
|
|
71
60
|
|
|
61
|
+
### Task-based shifts (targeted, one-off changes)
|
|
62
|
+
|
|
63
|
+
For targeted changes to specific records (e.g. fixing a bug for particular IDs), use `task` blocks instead:
|
|
64
|
+
|
|
65
|
+
```ruby
|
|
66
|
+
module DataShifts
|
|
67
|
+
class FixOrderDiscrepancies < DataShifter::Shift
|
|
68
|
+
description "Fix order #1234 shipping and billing issues"
|
|
69
|
+
|
|
70
|
+
task "Correct shipping address" do
|
|
71
|
+
order.update!(shipping_address: "123 Main St")
|
|
72
|
+
end
|
|
73
|
+
|
|
74
|
+
task "Apply missing discount" do
|
|
75
|
+
order.update!(discount_cents: 500)
|
|
76
|
+
end
|
|
77
|
+
|
|
78
|
+
private
|
|
79
|
+
|
|
80
|
+
def order
|
|
81
|
+
@order ||= Order.find(1234)
|
|
82
|
+
end
|
|
83
|
+
end
|
|
84
|
+
end
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
Task blocks run in the context of the shift instance, so they have access to private helper methods, `dry_run?`, `log`, `skip!`, `find_exactly!`, and any other instance methods you define. Use private methods to DRY up shared lookups across tasks.
|
|
88
|
+
|
|
89
|
+
Task blocks:
|
|
90
|
+
|
|
91
|
+
- Run in sequence within the same lifecycle (transaction, dry run protection, summary)
|
|
92
|
+
- Default to single transaction (all tasks commit or roll back together); use `transaction :per_record` for per-task transactions
|
|
93
|
+
|
|
94
|
+
Generate a task-based shift with:
|
|
95
|
+
|
|
96
|
+
```bash
|
|
97
|
+
bin/rails generate data_shift fix_order_1234 --task
|
|
98
|
+
```
|
|
99
|
+
|
|
72
100
|
## Dry run vs commit
|
|
73
101
|
|
|
74
|
-
Shifts run in **dry run** mode by default.
|
|
102
|
+
Shifts run in **dry run** mode by default. DB changes are always rolled back in dry run mode, regardless of transaction setting.
|
|
75
103
|
|
|
76
104
|
- **Dry run (default)**: `rake data:shift:backfill_foo`
|
|
77
105
|
- **Commit**: `COMMIT=1 rake data:shift:backfill_foo`
|
|
78
106
|
- (`COMMIT=true` or `DRY_RUN=false` also commit)
|
|
79
107
|
|
|
80
|
-
|
|
108
|
+
### Automatic side-effect guards (dry run)
|
|
109
|
+
|
|
110
|
+
In **dry run** mode, DataShifter automatically blocks or fakes these side effects so unguarded code is less likely to hit the network or send mail/jobs:
|
|
111
|
+
|
|
112
|
+
| Service | Behavior in dry run |
|
|
113
|
+
|-------------|----------------------|
|
|
114
|
+
| **HTTP** | Blocked via WebMock (`disable_net_connect!`). Allow specific hosts with `allow_external_requests [...]` or `DataShifter.config.allow_external_requests`. |
|
|
115
|
+
| **ActionMailer** | `perform_deliveries = false` (restored after run). |
|
|
116
|
+
| **ActiveJob** | Queue adapter set to `:test` (restored after run). |
|
|
117
|
+
| **Sidekiq** | `Sidekiq::Testing.fake!` (restored with `disable!` after run). Only applied if `Sidekiq::Testing` is already loaded. |
|
|
118
|
+
|
|
119
|
+
**Guarding other side effects:** For anything we don't cover (e.g. another service, or allowed HTTP that mutates), use e.g. `return if dry_run?` in your shift. DB changes are always rolled back in dry run; only non-DB side effects need this.
|
|
120
|
+
|
|
121
|
+
To allow HTTP to specific hosts during dry run (e.g. a migration that must call an API to compute values), use the per-shift DSL or global config (NOTE: it is your responsibility to ensure you only make readonly requests in `dry_run?` mode):
|
|
122
|
+
|
|
123
|
+
```ruby
|
|
124
|
+
# Per shift
|
|
125
|
+
module DataShifts
|
|
126
|
+
class BackfillFromApi < DataShifter::Shift
|
|
127
|
+
allow_external_requests ["api.readonly.example.com", %r{\.internal\.company\z}]
|
|
128
|
+
# ...
|
|
129
|
+
end
|
|
130
|
+
end
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
```ruby
|
|
134
|
+
# Global (e.g. in config/initializers/data_shifter.rb)
|
|
135
|
+
DataShifter.configure do |config|
|
|
136
|
+
config.allow_external_requests = ["api.readonly.example.com"]
|
|
137
|
+
end
|
|
138
|
+
```
|
|
139
|
+
|
|
140
|
+
Allowed hosts are combined (per-shift + global). Restore (WebMock, mail, jobs) happens in an `ensure` so later code and other specs are unaffected.
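The save, override, and restore-in-`ensure` pattern described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: `FakeMailerConfig` and `with_deliveries_disabled` are hypothetical names invented for this example, and DataShifter's actual guard code may be structured differently.

```ruby
# Hypothetical stand-in for a mailer configuration object.
FakeMailerConfig = Struct.new(:perform_deliveries)

# Disable deliveries for the duration of the block, then restore the
# previous value even if the block raises.
def with_deliveries_disabled(config)
  previous = config.perform_deliveries
  config.perform_deliveries = false
  yield
ensure
  config.perform_deliveries = previous
end

config = FakeMailerConfig.new(true)

begin
  with_deliveries_disabled(config) do
    raise "simulated failure inside the dry run"
  end
rescue RuntimeError
  # The error still propagates; the setting has already been restored.
end

puts config.perform_deliveries
```

Because the restore sits in `ensure`, a failing shift cannot leave deliveries switched off for whatever runs next in the same process.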
|
|
81
141
|
|
|
82
142
|
## Transaction modes
|
|
83
143
|
|
|
@@ -85,7 +145,7 @@ Set the transaction mode at the class level:
|
|
|
85
145
|
|
|
86
146
|
- **`transaction :single` / `transaction true` (default)**: one DB transaction for the entire run; dry run rolls back at the end; a record error aborts the run.
|
|
87
147
|
- **`transaction :per_record`**: in commit mode, each record runs in its own transaction (errors are collected and the run continues); in dry run, the run is wrapped in a single rollback transaction.
|
|
88
|
-
- **`transaction false` / `transaction :none`**:
|
|
148
|
+
- **`transaction false` / `transaction :none`**: No automatic transaction in **commit** mode only. In dry run, the run is still wrapped in a single rollback transaction so DB changes are never committed. Use when you have external side effects or your own transaction strategy in commit mode.
|
|
89
149
|
|
|
90
150
|
```ruby
|
|
91
151
|
module DataShifts
|
|
@@ -137,7 +197,53 @@ CONTINUE_FROM=123 COMMIT=1 rake data:shift:backfill_foo
|
|
|
137
197
|
Notes:
|
|
138
198
|
|
|
139
199
|
- Only supported for `ActiveRecord::Relation` collections (Array-based collections—like those from `find_exactly!`—cannot be resumed).
|
|
140
|
-
- The filter is `primary_key > CONTINUE_FROM`, so it
|
|
200
|
+
- The filter is `primary_key > CONTINUE_FROM`, so it's only useful with monotonically increasing primary keys (e.g. `find_each`'s default behavior).
|
|
201
|
+
|
|
202
|
+
## How shift files map to rake tasks
|
|
203
|
+
|
|
204
|
+
DataShifter defines one rake task per file in `lib/data_shifts/*.rb`.
|
|
205
|
+
|
|
206
|
+
- **Task name**: derived from the filename with any leading digits removed.
|
|
207
|
+
- `20260201120000_backfill_foo.rb` → `data:shift:backfill_foo` (leading `<digits>_` prefix is stripped)
|
|
208
|
+
- `backfill_foo.rb` → `data:shift:backfill_foo`
|
|
209
|
+
- **Class name**: task name camelized, inside the `DataShifts` module.
|
|
210
|
+
- `backfill_foo` → `DataShifts::BackfillFoo`
|
|
211
|
+
|
|
212
|
+
Shift files are **required only when the task runs** (tasks are defined up front; classes load lazily).
|
|
213
|
+
The `description "..."` line is extracted from the file and used for `rake -T` output without loading the shift class.
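The mapping rules above can be approximated in a few lines of plain Ruby. The helper names (`task_name_for`, `class_name_for`, `description_for`) are hypothetical and exist only for this sketch; the gem's own implementation may differ, for example by using ActiveSupport's `camelize`.

```ruby
# Hypothetical helpers; plain Ruby, no ActiveSupport required.

# Strip the directory, the ".rb" extension, and a leading "<digits>_" prefix.
def task_name_for(path)
  File.basename(path, ".rb").sub(/\A\d+_/, "")
end

# Camelize the task name and place it inside the DataShifts module.
def class_name_for(task_name)
  "DataShifts::" + task_name.split("_").map(&:capitalize).join
end

# Read the description from the source text without evaluating the file.
def description_for(source)
  source[/^\s*description\s+"([^"]*)"/, 1]
end

name = task_name_for("lib/data_shifts/20260201120000_backfill_foo.rb")
puts "data:shift:#{name}"
puts class_name_for(name)
```

Reading the description with a regular expression is what allows it to appear in `rake -T` without loading the shift class; the exact pattern used here is an assumption.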
|
|
214
|
+
|
|
215
|
+
## Configuration
|
|
216
|
+
|
|
217
|
+
Configure DataShifter globally in an initializer:
|
|
218
|
+
|
|
219
|
+
```ruby
|
|
220
|
+
# config/initializers/data_shifter.rb
|
|
221
|
+
DataShifter.configure do |config|
|
|
222
|
+
# Hosts allowed for HTTP during dry run only (no effect in commit mode)
|
|
223
|
+
config.allow_external_requests = ["api.readonly.example.com"]
|
|
224
|
+
|
|
225
|
+
# Suppress repeated log messages during a shift run (default: true)
|
|
226
|
+
config.suppress_repeated_logs = true
|
|
227
|
+
|
|
228
|
+
# Max unique messages to track for deduplication (default: 1000)
|
|
229
|
+
config.repeated_log_cap = 1000
|
|
230
|
+
|
|
231
|
+
# Global default for progress bar visibility (default: true)
|
|
232
|
+
config.progress_enabled = true
|
|
233
|
+
|
|
234
|
+
# Default status print interval in seconds when ENV STATUS_INTERVAL is not set (default: nil)
|
|
235
|
+
config.status_interval_seconds = nil
|
|
236
|
+
end
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
Per-shift overrides:
|
|
240
|
+
|
|
241
|
+
```ruby
|
|
242
|
+
class MyShift < DataShifter::Shift
|
|
243
|
+
progress false # Disable progress bar for this shift
|
|
244
|
+
suppress_repeated_logs false # Disable log deduplication for this shift
|
|
245
|
+
end
|
|
246
|
+
```
|
|
141
247
|
|
|
142
248
|
## Operational tips
|
|
143
249
|
|
|
@@ -145,7 +251,7 @@ Notes:
|
|
|
145
251
|
|
|
146
252
|
- **Start with a dry run**: run the task once with no environment variables set, confirm logs and summary look right, then re-run with `COMMIT=1`.
|
|
147
253
|
- **Make shifts idempotent**: structure `process_record` so re-running is safe (for example, update only when the target column is `NULL`, or compute the same derived value deterministically).
|
|
148
|
-
- **Guard side effects
|
|
254
|
+
- **Guard side effects we don't auto-block**: use `return if dry_run?` for any side effect not covered by Automatic side-effect guards (see above).
|
|
149
255
|
|
|
150
256
|
### Choosing a transaction mode (behavior + guidance)
|
|
151
257
|
|
|
@@ -156,8 +262,8 @@ Notes:
|
|
|
156
262
|
- **Behavior**: in commit mode, records are committed one-by-one; errors are collected and the run continues; the overall run fails at the end if any record failed.
|
|
157
263
|
- **Use when**: you want maximum progress and are OK investigating/fixing a subset of failures.
|
|
158
264
|
- **`transaction false` / `:none`**:
|
|
159
|
-
- **Behavior**: no automatic transaction
|
|
160
|
-
- **Use when**: you have intentional external side effects
|
|
265
|
+
- **Behavior**: in commit mode, no automatic transaction; in dry run, the run is still wrapped in a rollback transaction so DB changes are not committed.
|
|
266
|
+
- **Use when**: you have intentional external side effects or your own transaction/locking strategy in commit mode.
|
|
161
267
|
|
|
162
268
|
### Performance and operability (recommended)
|
|
163
269
|
|
|
@@ -182,17 +288,19 @@ def process_record(buyback)
|
|
|
182
288
|
end
|
|
183
289
|
```
|
|
184
290
|
|
|
185
|
-
### `skip!` (count but don
|
|
291
|
+
### `skip!` (count but don't update)
|
|
186
292
|
|
|
187
|
-
Mark a record as skipped (
|
|
293
|
+
Mark a record as skipped. Calling `skip!` terminates the current `process_record` immediately (no `return` needed). The record is counted as "Skipped" in the summary.
|
|
188
294
|
|
|
189
295
|
```ruby
|
|
190
296
|
def process_record(record)
|
|
191
297
|
skip!("already done") if record.foo.present?
|
|
192
|
-
record.update!(foo: value)
|
|
298
|
+
record.update!(foo: value) # not executed if skipped
|
|
193
299
|
end
|
|
194
300
|
```
|
|
195
301
|
|
|
302
|
+
Skip reasons are grouped: the summary shows the top 10 reasons by count (e.g. `"already done" (42), "not eligible" (3)`) instead of logging each skip inline. This keeps the progress bar clean.
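One way a `skip!` call can end the current record without an explicit `return` is Ruby's `catch`/`throw`. The sketch below assumes that mechanism purely for illustration; the README does not say how DataShifter implements it, and `SkipTracker` with its methods is a hypothetical class, not part of the gem.

```ruby
# Hypothetical sketch; not DataShifter's implementation.
class SkipTracker
  attr_reader :skip_reasons, :processed

  def initialize
    @skip_reasons = Hash.new(0)
    @processed = 0
  end

  # Runs the block once per record. A skip! call inside the block unwinds
  # straight back to this catch, so the rest of the block never runs.
  def each_record(records)
    records.each do |record|
      catch(:skip_record) do
        yield record
        @processed += 1
      end
    end
  end

  # Count the reason, then abandon the current record.
  def skip!(reason)
    @skip_reasons[reason] += 1
    throw :skip_record
  end

  # Most frequent reasons first, limited to the top entries.
  def top_skip_reasons(limit = 10)
    @skip_reasons.sort_by { |_reason, count| -count }.first(limit)
  end
end

tracker = SkipTracker.new
tracker.each_record([1, 2, 3, 4, 5]) do |n|
  tracker.skip!("even number") if n.even?
  tracker.skip!("too large") if n > 4
end

tracker.top_skip_reasons.each { |reason, count| puts %("#{reason}" (#{count})) }
```

Counting reasons in a hash and printing only the most frequent ones keeps per-record output quiet, which matches the stated goal of a clean progress bar.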
|
|
303
|
+
|
|
196
304
|
### Throttling and disabling the progress bar
|
|
197
305
|
|
|
198
306
|
```ruby
|
|
@@ -202,19 +310,29 @@ class SomeShift < DataShifter::Shift
|
|
|
202
310
|
end
|
|
203
311
|
```
|
|
204
312
|
|
|
313
|
+
|
|
205
314
|
## Generator
|
|
206
315
|
|
|
207
316
|
| Command | Generates |
|
|
208
317
|
|--------|----------|
|
|
209
318
|
| `bin/rails generate data_shift backfill_foo` | `lib/data_shifts/<timestamp>_backfill_foo.rb` with a `DataShifts::BackfillFoo` class |
|
|
210
|
-
| `bin/rails generate data_shift backfill_users --model
|
|
319
|
+
| `bin/rails generate data_shift backfill_users --model User` | Same, with `User.all` in `collection` and `process_record(user)` |
|
|
211
320
|
| `bin/rails generate data_shift backfill_users --spec` | Also generates `spec/lib/data_shifts/backfill_users_spec.rb` when RSpec is enabled |
|
|
321
|
+
| `bin/rails generate data_shift fix_order_1234 --task` | Generates a shift with a `task` block instead of `collection`/`process_record` |
|
|
212
322
|
|
|
213
323
|
The generator refuses to create a second shift if it would produce a duplicate rake task name.
|
|
214
324
|
|
|
215
325
|
## Testing shifts (RSpec)
|
|
216
326
|
|
|
217
|
-
This gem ships a small helper module for running shifts in tests:
|
|
327
|
+
This gem ships a small helper module for running shifts in tests. Require it and include `DataShifter::SpecHelper` in specs or in `RSpec.configure` for `type: :data_shift`.
|
|
328
|
+
|
|
329
|
+
**Helpers:**
|
|
330
|
+
|
|
331
|
+
- **`run_data_shift(shift_class, dry_run: true, commit: false)`** — Runs the shift; returns an `Axn::Result`. Use `commit: true` to run in commit mode.
|
|
332
|
+
- **`silence_data_shift_output`** — Suppresses STDOUT for the block (e.g. progress bar).
|
|
333
|
+
- **`capture_data_shift_output`** — Runs the block and returns `[result, output_string]` for asserting on printed output.
|
|
334
|
+
|
|
335
|
+
Use `expect { ... }.not_to change(...)` and `expect { ... }.to change(...)` to assert that data stays unchanged in dry run and changes when committed:
|
|
218
336
|
|
|
219
337
|
```ruby
|
|
220
338
|
require "data_shifter/spec_helper"
|
|
@@ -222,35 +340,29 @@ require "data_shifter/spec_helper"
|
|
|
222
340
|
RSpec.describe DataShifts::BackfillFoo do
|
|
223
341
|
include DataShifter::SpecHelper
|
|
224
342
|
|
|
225
|
-
before { allow($stdout).to receive(:puts) }
|
|
343
|
+
before { allow($stdout).to receive(:puts) }
|
|
226
344
|
|
|
227
345
|
it "does not persist changes in dry run" do
|
|
228
|
-
|
|
229
|
-
|
|
230
|
-
|
|
346
|
+
expect do
|
|
347
|
+
result = run_data_shift(described_class, dry_run: true)
|
|
348
|
+
expect(result).to be_ok
|
|
349
|
+
end.not_to change(Foo, :count)
|
|
231
350
|
end
|
|
232
351
|
|
|
233
352
|
it "persists changes when committed" do
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
353
|
+
expect do
|
|
354
|
+
result = run_data_shift(described_class, commit: true)
|
|
355
|
+
expect(result).to be_ok
|
|
356
|
+
end.to change(Foo, :count).by(1)
|
|
357
|
+
# Or for in-place updates: .to change { record.reload.bar }.from(nil).to("baz")
|
|
237
358
|
end
|
|
238
359
|
end
|
|
239
360
|
```
|
|
240
361
|
|
|
241
|
-
## Optional RuboCop cop
|
|
242
|
-
|
|
243
|
-
If you use `transaction false` / `transaction :none`, you should guard writes and side effects with `dry_run?`. You can help avoid mistakes by linting that the helper is at least called once via the bundled cop:
|
|
244
|
-
|
|
245
|
-
```yaml
|
|
246
|
-
# .rubocop.yml
|
|
247
|
-
require:
|
|
248
|
-
- data_shifter/rubocop
|
|
249
|
-
```
|
|
250
|
-
|
|
251
362
|
## Requirements
|
|
252
363
|
|
|
253
364
|
- Ruby ≥ 3.2.1
|
|
254
|
-
- Rails (ActiveRecord, ActiveSupport, Railties) ≥
|
|
365
|
+
- Rails (ActiveRecord, ActiveSupport, Railties) ≥ 7.0
|
|
255
366
|
- `axn` (Shift classes include `Axn`)
|
|
256
367
|
- `ruby-progressbar` (for progress bars)
|
|
368
|
+
- `webmock` (for dry-run HTTP blocking; optional allowlist via `allow_external_requests [...]` / `DataShifter.config.allow_external_requests`)
|
data/lib/data_shifter/configuration.rb
ADDED
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module DataShifter
|
|
4
|
+
# Global configuration for DataShifter.
|
|
5
|
+
#
|
|
6
|
+
# Configure via:
|
|
7
|
+
# DataShifter.configure do |config|
|
|
8
|
+
# config.allow_external_requests = ["api.readonly.example.com"]
|
|
9
|
+
# config.suppress_repeated_logs = true
|
|
10
|
+
# end
|
|
11
|
+
#
|
|
12
|
+
# Or access directly:
|
|
13
|
+
# DataShifter.config.progress_enabled = false
|
|
14
|
+
class Configuration
|
|
15
|
+
# Hosts or regexes allowed for HTTP during dry run only (combined with per-shift allow_external_requests).
|
|
16
|
+
# Has no effect in commit mode — HTTP is unrestricted when dry_run is false.
|
|
17
|
+
attr_accessor :allow_external_requests
|
|
18
|
+
|
|
19
|
+
# Whether to suppress repeated log messages during a shift run. Default: true.
|
|
20
|
+
# Can be overridden per shift with `suppress_repeated_logs true/false`.
|
|
21
|
+
attr_accessor :suppress_repeated_logs
|
|
22
|
+
|
|
23
|
+
# Maximum unique log messages to track for deduplication. Default: 1000.
|
|
24
|
+
# When exceeded, entries with count == 1 are cleared first; repeated entries are kept.
|
|
25
|
+
attr_accessor :repeated_log_cap
|
|
26
|
+
|
|
27
|
+
# Global default for progress bar visibility. Default: true.
|
|
28
|
+
# Per-shift `progress true/false` overrides this.
|
|
29
|
+
attr_accessor :progress_enabled
|
|
30
|
+
|
|
31
|
+
# Default status print interval in seconds when ENV STATUS_INTERVAL is not set. Default: nil.
|
|
32
|
+
attr_accessor :status_interval_seconds
|
|
33
|
+
|
|
34
|
+
def initialize
|
|
35
|
+
@allow_external_requests = []
|
|
36
|
+
@suppress_repeated_logs = true
|
|
37
|
+
@repeated_log_cap = 1000
|
|
38
|
+
@progress_enabled = true
|
|
39
|
+
@status_interval_seconds = nil
|
|
40
|
+
end
|
|
41
|
+
end
|
|
42
|
+
end
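The `repeated_log_cap` comment above describes an eviction rule: when the table is full, entries seen only once are cleared first and repeated entries are kept. A minimal sketch of that rule follows. `TinyLogDeduplicator` is a hypothetical class written for this note; it is not the gem's `log_deduplicator.rb`, whose details may differ.

```ruby
# Hypothetical sketch of count-based log deduplication with a size cap.
class TinyLogDeduplicator
  attr_reader :counts

  def initialize(cap:)
    @cap = cap
    @counts = {}
  end

  # Returns true when the message should be logged (first time seen),
  # false when it is a repeat that should only be counted.
  def log?(message)
    if @counts.key?(message)
      @counts[message] += 1
      return false
    end

    evict_singletons if @counts.size >= @cap
    @counts[message] = 1
    true
  end

  # Messages seen more than once, for an end-of-run summary.
  def repeated
    @counts.select { |_message, count| count > 1 }
  end

  private

  # Drop entries seen only once; keep the ones that are actually repeating.
  def evict_singletons
    @counts.delete_if { |_message, count| count == 1 }
  end
end

dedup = TinyLogDeduplicator.new(cap: 2)
dedup.log?("a") # first time: log it
dedup.log?("a") # repeat: counted only
dedup.log?("b") # first time: log it
dedup.log?("c") # cap reached: "b" (count 1) is evicted, "a" is kept
p dedup.counts
```

The comment does not say what happens when every tracked entry is already a repeat; this sketch simply lets the table grow in that case, which is an assumption.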
|
data/lib/data_shifter/errors.rb
ADDED
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module DataShifter
|
|
4
|
+
# Raised when a dry run attempts an outbound HTTP request to a host that is
|
|
5
|
+
# not allowed via allow_external_requests (per-shift or global config).
|
|
6
|
+
class ExternalRequestNotAllowedError < StandardError
|
|
7
|
+
def initialize(attempted_host: nil)
|
|
8
|
+
@attempted_host = attempted_host
|
|
9
|
+
super(build_message)
|
|
10
|
+
end
|
|
11
|
+
|
|
12
|
+
attr_reader :attempted_host
|
|
13
|
+
|
|
14
|
+
private
|
|
15
|
+
|
|
16
|
+
def build_message
|
|
17
|
+
intro = if @attempted_host && !@attempted_host.to_s.strip.empty?
|
|
18
|
+
"Dry run blocked an outbound HTTP request to #{@attempted_host}."
|
|
19
|
+
else
|
|
20
|
+
"Dry run blocked an outbound HTTP request."
|
|
21
|
+
end
|
|
22
|
+
|
|
23
|
+
if @attempted_host && !@attempted_host.to_s.strip.empty?
|
|
24
|
+
<<~MSG.strip
|
|
25
|
+
#{intro}
|
|
26
|
+
|
|
27
|
+
To allow this host during dry run, add to your shift class:
|
|
28
|
+
|
|
29
|
+
allow_external_requests ["#{@attempted_host}"]
|
|
30
|
+
|
|
31
|
+
Or set DataShifter.config.allow_external_requests in an initializer.
|
|
32
|
+
MSG
|
|
33
|
+
else
|
|
34
|
+
<<~MSG.strip
|
|
35
|
+
#{intro}
|
|
36
|
+
|
|
37
|
+
To allow specific hosts during dry run, add to your shift class:
|
|
38
|
+
|
|
39
|
+
allow_external_requests ["host.example.com"] # or use a regex
|
|
40
|
+
|
|
41
|
+
Or set DataShifter.config.allow_external_requests in an initializer.
|
|
42
|
+
MSG
|
|
43
|
+
end
|
|
44
|
+
end
|
|
45
|
+
end
|
|
46
|
+
end
|
data/lib/data_shifter/internal/colors.rb
ADDED
|
@@ -0,0 +1,71 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module DataShifter
|
|
4
|
+
module Internal
|
|
5
|
+
# ANSI color utilities for CLI output.
|
|
6
|
+
# Automatically detects TTY and respects NO_COLOR environment variable.
|
|
7
|
+
module Colors
|
|
8
|
+
CODES = {
|
|
9
|
+
reset: "\e[0m",
|
|
10
|
+
bold: "\e[1m",
|
|
11
|
+
dim: "\e[2m",
|
|
12
|
+
green: "\e[32m",
|
|
13
|
+
yellow: "\e[33m",
|
|
14
|
+
red: "\e[31m",
|
|
15
|
+
cyan: "\e[36m",
|
|
16
|
+
}.freeze
|
|
17
|
+
|
|
18
|
+
module_function
|
|
19
|
+
|
|
20
|
+
def enabled?(io = $stdout)
|
|
21
|
+
return false if ENV["NO_COLOR"]
|
|
22
|
+
return false unless io.respond_to?(:tty?)
|
|
23
|
+
|
|
24
|
+
io.tty?
|
|
25
|
+
end
|
|
26
|
+
|
|
27
|
+
def wrap(text, *styles, io: $stdout)
|
|
28
|
+
return text unless enabled?(io)
|
|
29
|
+
|
|
30
|
+
codes = styles.map { |s| CODES[s] }.compact.join
|
|
31
|
+
"#{codes}#{text}#{CODES[:reset]}"
|
|
32
|
+
end
|
|
33
|
+
|
|
34
|
+
def bold(text, io: $stdout)
|
|
35
|
+
wrap(text, :bold, io:)
|
|
36
|
+
end
|
|
37
|
+
|
|
38
|
+
def dim(text, io: $stdout)
|
|
39
|
+
wrap(text, :dim, io:)
|
|
40
|
+
end
|
|
41
|
+
|
|
42
|
+
def green(text, io: $stdout)
|
|
43
|
+
wrap(text, :green, io:)
|
|
44
|
+
end
|
|
45
|
+
|
|
46
|
+
def yellow(text, io: $stdout)
|
|
47
|
+
wrap(text, :yellow, io:)
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
def red(text, io: $stdout)
|
|
51
|
+
wrap(text, :red, io:)
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
def cyan(text, io: $stdout)
|
|
55
|
+
wrap(text, :cyan, io:)
|
|
56
|
+
end
|
|
57
|
+
|
|
58
|
+
def success(text, io: $stdout)
|
|
59
|
+
wrap(text, :bold, :green, io:)
|
|
60
|
+
end
|
|
61
|
+
|
|
62
|
+
def warning(text, io: $stdout)
|
|
63
|
+
wrap(text, :bold, :yellow, io:)
|
|
64
|
+
end
|
|
65
|
+
|
|
66
|
+
def error(text, io: $stdout)
|
|
67
|
+
wrap(text, :bold, :red, io:)
|
|
68
|
+
end
|
|
69
|
+
end
|
|
70
|
+
end
|
|
71
|
+
end
|
data/lib/data_shifter/internal/env.rb
CHANGED
|
@@ -18,14 +18,16 @@ module DataShifter
|
|
|
18
18
|
end
|
|
19
19
|
end
|
|
20
20
|
|
|
21
|
-
# Parse STATUS_INTERVAL environment variable.
|
|
22
|
-
# Returns nil if not set
|
|
21
|
+
# Parse STATUS_INTERVAL environment variable, falling back to config.
|
|
22
|
+
# Returns nil if not set/invalid and config is nil.
|
|
23
23
|
def status_interval_seconds
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
24
|
+
if ENV["STATUS_INTERVAL"].present?
|
|
25
|
+
Integer(ENV.fetch("STATUS_INTERVAL", nil), 10)
|
|
26
|
+
else
|
|
27
|
+
DataShifter.config.status_interval_seconds
|
|
28
|
+
end
|
|
27
29
|
rescue ArgumentError
|
|
28
|
-
|
|
30
|
+
DataShifter.config.status_interval_seconds
|
|
29
31
|
end
|
|
30
32
|
|
|
31
33
|
# Get CONTINUE_FROM environment variable value.
|