data_porter 2.5.1 → 2.6.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 99b0792c8088d9d2e55826f898fcfa99d872bea179f41d352b1e4aaff9ab14d8
- data.tar.gz: 77c539c666176f992747f2d326a9ce8284cb19426b5f1ae470a317f7dfd740f1
+ metadata.gz: 955a886124d8ff2f1da4f23e725a52caec3bbc635450f058e3bbce81f4b898f5
+ data.tar.gz: 6f1f0be41999d105c7558b7192f3f61d79f935431ce7b99bede5753d69be9ce3
  SHA512:
- metadata.gz: b863d64f885f55ba6f530ce99173fef55ca473f74c7b0fc6736890bd770583949cf2101389c0635707ea38ceede63aebc529fc1a1e5990c76bf5c0750de318d6
- data.tar.gz: 392ebe242c8ae947b37961bc8d59373a3d9f6a84730f0c946cd35af0a621ecc1a1cfb8b09b361f3a74fcc826c098fad317a4350d9b213206197406d88f763a01
+ metadata.gz: 399c87e6daa56196ae96525ca75b27a464a109463d3f4ebf5361738a93cc6385b30e37e588cab45c3a2218b99596dabcd9a48b0a9856eac17902907a9f4f9a34
+ data.tar.gz: f423073b27fc407cc94ecc8e11693d40da5210387129f428105ac4973d7b7be034a59b8161c2a4da3c99acc7e4ec1b13d4828c54336e12d441044549b6243151
data/CHANGELOG.md CHANGED
@@ -11,6 +11,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  - **Auto-map heuristics** -- Smart column suggestions that pre-fill mapping selects when CSV/XLSX headers match target fields by exact name or built-in synonym (e.g. "E-mail Address" → email, "fname" → first_name). Supports per-column custom synonyms via `synonyms:` keyword in column DSL. Fallback chain: saved mapping > code-defined > auto-map > empty
 
+ ## [2.6.0] - 2026-02-21
+
+ ### Added
+
+ - **Resume on failure** -- When an import fails mid-way (crash, timeout, exception), resume from the last successful record instead of re-importing from scratch. Progress checkpoints stored in the existing `config` JSONB column alongside `broadcast_progress` — zero additional DB operations or migrations. Works with both per-record and bulk import modes
+ - `resumable?` predicate on `DataImport` — returns `true` when a failed import has a checkpoint with processed records
+ - Resume button in the failed import UI (primary action), with Retry demoted to secondary
+ - `POST :resume` route on the imports controller
+
+ ### Fixed
+
+ - `handle_failure` now preserves existing report data (parsed counts, partial results) instead of creating a new empty report
+ - `parse!` now clears stale checkpoint and progress data from previous import attempts
+
+ ### Changed
+
+ - 574 RSpec examples (up from 551), 0 failures
+
  ## [2.5.1] - 2026-02-21
 
  ### Fixed
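As a note outside the diff: the resume-on-failure entry above describes checkpointing processed counts so a failed run can pick up where it left off. A hypothetical, framework-free sketch of that idea (names like `run_import` are mine, not the gem's API; only the checkpoint shape — `processed`/`created` counts in a config hash — comes from the diff):

```ruby
# Hypothetical sketch of checkpoint-based resume (not the gem's actual code):
# a checkpoint records how many records were processed plus partial tallies,
# so a re-run skips work already done instead of starting over.
def run_import(records, config, fail_at: nil)
  checkpoint = config["checkpoint"] || {}
  processed = checkpoint["processed"].to_i
  created = checkpoint["created"].to_i

  records.drop(processed).each do |_record|
    raise "simulated crash" if fail_at && processed == fail_at

    created += 1 # a real importer would persist the record here
    processed += 1
    config["checkpoint"] = { "processed" => processed, "created" => created }
  end

  config.delete("checkpoint") # success: nothing left to resume
  { processed: processed, created: created }
end

config = {}
begin
  run_import((1..10).to_a, config, fail_at: 4) # crashes after 4 records
rescue RuntimeError
  nil # the checkpoint survives: { "processed" => 4, "created" => 4 }
end
result = run_import((1..10).to_a, config) # resumes at record 5
# result => { processed: 10, created: 10 }
```

The second call processes only the six remaining records, which is the point: work done before the crash is never repeated.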
data/README.md CHANGED
@@ -103,6 +103,8 @@ pending -> parsing -> previewing -> importing -> completed
 
  **[Full documentation on GitHub Pages](https://seryllns.github.io/data_porter/)**
 
+ > **Build series**: Want to see how DataPorter was built step by step? [Building DataPorter on dev.to](https://dev.to/seryllns_/series/35813) -- 30 parts covering architecture, TDD, and every feature from first commit to production.
+
  | Topic | Description |
  |---|---|
  | [Configuration](docs/CONFIGURATION.md) | All options, authentication, context builder, real-time updates |
data/ROADMAP.md CHANGED
@@ -6,10 +6,6 @@
 
  Support update (upsert) imports alongside create-only. Given a `deduplicate_by` key, detect existing records and show a diff preview: new records, changed fields (highlighted), unchanged rows. User confirms which changes to apply. Enables recurring data sync workflows.
 
- ### Resume / retry on failure
-
- If an import fails mid-way (timeout, crash, transient error), resume from the last successful record instead of restarting from scratch. Track a checkpoint index in the report. Critical for large imports (5k+ records) where re-processing everything is not acceptable.
-
  ### API pagination
 
  Support paginated API sources. The current API source does a single GET, which works for small datasets but not for APIs returning thousands of records across multiple pages. Support offset, cursor, and link-header pagination strategies via `api_config`:
@@ -10,7 +10,7 @@ module DataPorter
  layout "data_porter/application"
 
  before_action :set_import, only: %i[show parse confirm cancel dry_run update_mapping
- status export_rejects destroy back_to_mapping]
+ status export_rejects destroy back_to_mapping resume]
  before_action :load_targets, only: %i[index new create]
 
  def index
@@ -69,6 +69,12 @@ module DataPorter
  redirect_to import_path(@import)
  end
 
+ def resume
+ @import.update!(status: :pending)
+ DataPorter::ImportJob.perform_later(@import.id)
+ redirect_to import_path(@import)
+ end
+
  def dry_run
  @import.update!(status: :pending)
  DataPorter::DryRunJob.perform_later(@import.id)
@@ -53,12 +53,16 @@ module DataPorter
  records.group_by(&:status).transform_values(&:count)
  end
 
+ def resumable?
+ failed? && config&.dig("checkpoint", "processed").to_i.positive?
+ end
+
  def reset_to_mapping!
  update!(
  status: :mapping,
  records: [],
  report: StoreModels::Report.new,
- config: (config || {}).except("progress")
+ config: (config || {}).except("progress", "checkpoint")
  )
  end
 
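Outside the diff: the `resumable?` predicate added above combines two checks — a failed status and a checkpoint with a positive processed count. A standalone approximation of that logic (plain Ruby, no ActiveRecord; `ImportStub` is an illustrative stand-in for `DataImport`):

```ruby
# Plain-Ruby approximation of the resumable? predicate: an import is
# resumable only when it has failed AND a checkpoint recorded progress.
ImportStub = Struct.new(:status, :config) do
  def failed?
    status == :failed
  end

  # Same expression as the predicate added to the model in this release.
  def resumable?
    failed? && config&.dig("checkpoint", "processed").to_i.positive?
  end
end

fresh_failure = ImportStub.new(:failed, nil) # failed before any checkpoint
mid_failure   = ImportStub.new(:failed, { "checkpoint" => { "processed" => 42 } })
running       = ImportStub.new(:importing, { "checkpoint" => { "processed" => 42 } })
```

Note how `config&.dig(...)` plus `.to_i` makes a missing config, a missing checkpoint, and a zero count all collapse to `false` without any explicit nil checks.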
@@ -103,8 +103,12 @@
  <% if @import.failed? %>
  <%= raw DataPorter::Components::Shared::FailureAlert.new(report: @import.report).call %>
  <div class="dp-actions">
+ <% if @import.resumable? %>
+ <%= button_to t("data_porter.imports.resume"), resume_import_path(@import),
+ method: :post, class: "dp-btn dp-btn--primary" %>
+ <% end %>
  <%= button_to t("data_porter.imports.retry"), parse_import_path(@import),
- method: :post, class: "dp-btn dp-btn--primary" %>
+ method: :post, class: "dp-btn dp-btn--secondary" %>
  <%= button_to t("data_porter.imports.delete"), import_path(@import),
  method: :delete, class: "dp-btn dp-btn--danger",
  data: { turbo_confirm: t("data_porter.imports.delete_confirm") } %>
@@ -10,6 +10,7 @@ en:
  delete: "Delete"
  delete_confirm: "Delete this import?"
  retry: "Retry"
+ resume: "Resume"
  start_import: "Start Import"
  confirm_import: "Confirm Import"
  dry_run: "Dry Run"
@@ -10,6 +10,7 @@ fr:
  delete: "Supprimer"
  delete_confirm: "Supprimer cet import ?"
  retry: "Réessayer"
+ resume: "Reprendre"
  start_import: "Lancer l'import"
  confirm_import: "Confirmer l'import"
  dry_run: "Essai à blanc"
data/config/routes.rb CHANGED
@@ -10,6 +10,7 @@ DataPorter::Engine.routes.draw do
  post :cancel
  post :back_to_mapping
  post :dry_run
+ post :resume
  patch :update_mapping
  get :status
  get :export_rejects
@@ -7,46 +7,54 @@ module DataPorter
 
  def import_bulk
  importable = @data_import.importable_records
- context = build_context
- config = @target.class._bulk_config
- results = { created: 0, errored: 0 }
- total = importable.size
- processed = 0
-
- importable.each_slice(config[:batch_size]) do |batch|
- persist_batch_with_fallback(batch, context, config, results)
- processed += batch.size
- broadcast_progress(processed, total)
- end
+ checkpoint = load_checkpoint
+ @bulk_state = build_bulk_state(importable, checkpoint)
+
+ process_batches(importable.drop(checkpoint[:processed]))
+ finalize_import(@bulk_state[:results])
+ end
+
+ def build_bulk_state(importable, checkpoint)
+ {
+ context: build_context,
+ bulk_config: @target.class._bulk_config,
+ results: seed_results(checkpoint),
+ total: importable.size,
+ processed: checkpoint[:processed]
+ }
+ end
 
- finalize_import(results)
+ def process_batches(records)
+ records.each_slice(@bulk_state[:bulk_config][:batch_size]) do |batch|
+ persist_batch_with_fallback(batch)
+ @bulk_state[:processed] += batch.size
+ broadcast_progress(@bulk_state[:processed], @bulk_state[:total], results: @bulk_state[:results])
+ end
  end
 
- def persist_batch_with_fallback(batch, context, config, results)
- @target.persist_batch(batch, context: context)
- results[:created] += batch.size
+ def persist_batch_with_fallback(batch)
+ @target.persist_batch(batch, context: @bulk_state[:context])
+ @bulk_state[:results][:created] += batch.size
  rescue StandardError => e
- handle_batch_failure(batch, context, config, results, e)
+ handle_batch_failure(batch, e)
  end
 
- def handle_batch_failure(batch, context, config, results, error)
- if config[:on_conflict] == :fail_batch
- fail_batch(batch, results, error)
+ def handle_batch_failure(batch, error)
+ if @bulk_state[:bulk_config][:on_conflict] == :fail_batch
+ fail_batch(batch, error)
  else
- retry_per_record(batch, context, results)
+ retry_per_record(batch)
  end
  end
 
- def fail_batch(batch, results, error)
- batch.each do |record|
- record.add_error(error.message)
- end
- results[:errored] += batch.size
+ def fail_batch(batch, error)
+ batch.each { |record| record.add_error(error.message) }
+ @bulk_state[:results][:errored] += batch.size
  end
 
- def retry_per_record(batch, context, results)
+ def retry_per_record(batch)
  batch.each do |record|
- persist_record(record, context, results)
+ persist_record(record, @bulk_state[:context], @bulk_state[:results])
  end
  end
  end
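Outside the diff: the bulk path above persists a whole batch at once and falls back to per-record persistence when the batch write raises. An illustrative, standalone version of that pattern (the function name and lambda parameters are mine, not the gem's API):

```ruby
# Illustrative batch-with-fallback: try to persist a whole slice at once;
# if the bulk write raises, retry each record individually so one bad row
# doesn't sink its entire batch.
def import_in_batches(records, batch_size:, persist_batch:, persist_one:)
  results = { created: 0, errored: 0 }

  records.each_slice(batch_size) do |batch|
    persist_batch.call(batch)
    results[:created] += batch.size
  rescue StandardError
    batch.each do |record|
      persist_one.call(record)
      results[:created] += 1
    rescue StandardError
      results[:errored] += 1
    end
  end

  results
end

results = import_in_batches(
  (1..7).to_a,
  batch_size: 3,
  persist_batch: ->(batch) { raise "dup key" if batch.include?(4) },
  persist_one: ->(record) { raise "dup key" if record == 4 }
)
# batch [4, 5, 6] fails in bulk and is retried per record; only record 4 errors
# results => { created: 6, errored: 1 }
```

Resume composes naturally with this: dropping already-processed records before `each_slice` (as `import_bulk` does with `importable.drop(checkpoint[:processed])`) means a resumed run re-enters the same loop with a shorter list.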
@@ -18,12 +18,14 @@ module DataPorter
  def import_per_record
  importable = @data_import.importable_records
  context = build_context
- results = { created: 0, errored: 0 }
+ checkpoint = load_checkpoint
+ results = seed_results(checkpoint)
+ remaining = importable.drop(checkpoint[:processed])
  total = importable.size
 
- importable.each_with_index do |record, index|
+ remaining.each_with_index do |record, index|
  persist_record(record, context, results)
- broadcast_progress(index + 1, total)
+ broadcast_progress(checkpoint[:processed] + index + 1, total, results: results)
  end
 
  finalize_import(results)
@@ -45,6 +47,7 @@ module DataPorter
  end
 
  def finalize_import(results)
+ clear_checkpoint
  @data_import.update!(status: :completed)
  @broadcaster.success
  WebhookNotifier.notify(@data_import, "import.completed")
@@ -66,6 +69,25 @@ module DataPorter
  report.errored_count = results[:errored]
  @data_import.update!(report: report)
  end
+
+ def load_checkpoint
+ cp = @data_import.config&.dig("checkpoint") || {}
+ {
+ processed: cp["processed"].to_i,
+ created: cp["created"].to_i,
+ errored: cp["errored"].to_i
+ }
+ end
+
+ def seed_results(checkpoint)
+ { created: checkpoint[:created], errored: checkpoint[:errored] }
+ end
+
+ def clear_checkpoint
+ config = @data_import.config || {}
+ config.delete("checkpoint")
+ @data_import.update_column(:config, config)
+ end
  end
  end
  end
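Outside the diff: `load_checkpoint` above normalizes the string-keyed JSON stored in `config` into a symbol-keyed hash of integers, so callers never need nil checks. The same logic as a free function (the standalone signature is my illustration; the key names and `.to_i` defaults come from the diff):

```ruby
# Standalone version of the load_checkpoint normalization: string-keyed
# JSON data -- or no checkpoint at all -- becomes a symbol-keyed hash of
# integers with safe zero defaults.
def load_checkpoint(config)
  cp = (config || {})["checkpoint"] || {}
  {
    processed: cp["processed"].to_i,
    created: cp["created"].to_i,
    errored: cp["errored"].to_i
  }
end

load_checkpoint(nil)
# => { processed: 0, created: 0, errored: 0 }
load_checkpoint("checkpoint" => { "processed" => "12", "created" => 10 })
# => { processed: 12, created: 10, errored: 0 }
```

Because `nil.to_i` is `0`, a fresh import and a missing checkpoint look identical to the callers (`seed_results`, `drop`), which is what makes resume a no-op on first runs.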
@@ -32,6 +32,7 @@ module DataPorter
  def parse!
  @data_import.parsing!
  records = build_records
+ clear_stale_import_data
  @data_import.update!(records: records, status: :previewing)
  build_report
  WebhookNotifier.notify(@data_import, "import.parsed")
@@ -92,18 +93,36 @@ module DataPorter
  DataPorter.configuration.context_builder&.call(@data_import)
  end
 
- def broadcast_progress(current, total)
- percentage = ((current.to_f / total) * 100).round
+ def broadcast_progress(current, total, results: nil)
  config = @data_import.config || {}
- config["progress"] = { "current" => current, "total" => total, "percentage" => percentage }
+ config["progress"] = { "current" => current, "total" => total, "percentage" => pct(current, total) }
+ save_checkpoint(config, current, results) if results
  @data_import.update_column(:config, config)
  @broadcaster.progress(current, total)
  end
 
+ def pct(current, total)
+ ((current.to_f / total) * 100).round
+ end
+
+ def save_checkpoint(config, processed, results)
+ config["checkpoint"] = {
+ "processed" => processed,
+ "created" => results[:created],
+ "errored" => results[:errored]
+ }
+ end
+
+ def clear_stale_import_data
+ config = @data_import.config || {}
+ config.delete("checkpoint")
+ config.delete("progress")
+ @data_import.config = config
+ end
+
  def handle_failure(error)
- report = StoreModels::Report.new(
- error_reports: [StoreModels::Error.new(message: error.message)]
- )
+ report = @data_import.report || StoreModels::Report.new
+ report.error_reports = [StoreModels::Error.new(message: error.message)]
  @data_import.update!(status: :failed, report: report)
  @broadcaster.failure(error.message)
  WebhookNotifier.notify(@data_import, "import.failed")
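Outside the diff: the `handle_failure` fix above keeps the existing report (with its parsed counts and partial results) and only overwrites the error list, rather than replacing the whole report. A minimal sketch of that mutate-don't-replace pattern (`Report` here is an illustrative struct, not the gem's `StoreModels::Report`):

```ruby
# Minimal sketch of the handle_failure fix: reuse the existing report so
# partial results survive a failure; only the error list is overwritten.
Report = Struct.new(:parsed_count, :error_messages)

def handle_failure(existing_report, error_message)
  report = existing_report || Report.new(0, [])
  report.error_messages = [error_message]
  report
end

before = Report.new(120, [])
after = handle_failure(before, "boom")
# after.parsed_count => 120  (partial results survive the failure)
```

This is what makes resume meaningful in the UI: the failed import still shows how far it got, and `resumable?` can rely on that surviving state.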
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module DataPorter
- VERSION = "2.5.1"
+ VERSION = "2.6.0"
  end
data/mkdocs.yml CHANGED
@@ -94,7 +94,7 @@ nav:
  - Column Mapping: MAPPING.md
  - Views & Theming: VIEWS.md
  - Routes: routes.md
- - Advanced: ADVANCED.md
+ - Advanced: ADVANCED.md
  - Roadmap: ROADMAP.md
  - Changelog: changelog.md
  - Contributing: contributing.md
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: data_porter
  version: !ruby/object:Gem::Version
- version: 2.5.1
+ version: 2.6.0
  platform: ruby
  authors:
  - Seryl Lounis