data_porter 2.4.0 → 2.6.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: b393c618ddb47334a61a62de03796211188b58367cad71bfc8574874208a34db
- data.tar.gz: 9eaa6b01ea0127bd22fa8557cccc34f47070e58991512f66996df6f9b789e8e3
+ metadata.gz: 955a886124d8ff2f1da4f23e725a52caec3bbc635450f058e3bbce81f4b898f5
+ data.tar.gz: 6f1f0be41999d105c7558b7192f3f61d79f935431ce7b99bede5753d69be9ce3
  SHA512:
- metadata.gz: 9f5e73c6eb11dccd143496f4147c941d2729a17643b10eed46f092e4615a2a861d1359b6993dadddaa7ab256672646a25b93f73f77bf238b1b4543fb7c92eb26
- data.tar.gz: 3f2c4be7e4dbe828a57bd81709c64b5c592c71aabaae0e604643216455c2c3bfc7ef74d14b78241658104c996125495587cfff73301c5a8f918abca64aca8e4f
+ metadata.gz: 399c87e6daa56196ae96525ca75b27a464a109463d3f4ebf5361738a93cc6385b30e37e588cab45c3a2218b99596dabcd9a48b0a9856eac17902907a9f4f9a34
+ data.tar.gz: f423073b27fc407cc94ecc8e11693d40da5210387129f428105ac4973d7b7be034a59b8161c2a4da3c99acc7e4ec1b13d4828c54336e12d441044549b6243151
data/CHANGELOG.md CHANGED
@@ -5,6 +5,46 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+ ## [Unreleased]
+
+ ### Added
+
+ - **Auto-map heuristics** -- Smart column suggestions that pre-fill mapping selects when CSV/XLSX headers match target fields by exact name or built-in synonym (e.g. "E-mail Address" → email, "fname" → first_name). Supports per-column custom synonyms via the `synonyms:` keyword in the column DSL. Fallback chain: saved mapping > code-defined > auto-map > empty
+
+ ## [2.6.0] - 2026-02-21
+
+ ### Added
+
+ - **Resume on failure** -- When an import fails mid-way (crash, timeout, exception), resume from the last successful record instead of re-importing from scratch. Progress checkpoints are stored in the existing `config` JSONB column alongside `broadcast_progress` -- zero additional DB operations or migrations. Works with both per-record and bulk import modes
+ - `resumable?` predicate on `DataImport` -- returns `true` when a failed import has a checkpoint with processed records
+ - Resume button in the failed import UI (primary action), with Retry demoted to secondary
+ - `POST :resume` route on the imports controller
+
+ ### Fixed
+
+ - `handle_failure` now preserves existing report data (parsed counts, partial results) instead of creating a new empty report
+ - `parse!` now clears stale checkpoint and progress data from previous import attempts
+
+ ### Changed
+
+ - 574 RSpec examples (up from 551), 0 failures
+
+ ## [2.5.1] - 2026-02-21
+
+ ### Fixed
+
+ - Display the target icon (from the `icon` DSL) in the imports index table and in the show page title/details. Previously the icon was stored in the Registry but never rendered in the UI
+
+ ## [2.5.0] - 2026-02-21
+
+ ### Added
+
+ - **Bulk import mode** -- Opt-in per-target batch persistence via `bulk_mode batch_size: 500, on_conflict: :retry_per_record`. Uses `insert_all` by default (with auto-injected timestamps) for 10-100x throughput on simple create scenarios. Custom batch logic via a `persist_batch` override. Configurable conflict strategy: `:retry_per_record` (default) retries failed batches record-by-record; `:fail_batch` marks the entire batch as errored. Progress broadcasts per batch instead of per record
+
+ ### Changed
+
+ - 551 RSpec examples (up from 540), 0 failures
+
  ## [2.4.0] - 2026-02-21
 
  ### Added
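The auto-map fallback described above can be condensed into a standalone sketch. This is plain Ruby mirroring the shipped `AutoMapper`'s header normalization and first-match-wins synonym lookup; the `auto_map` helper, `normalize`, and the two-entry `SYNONYMS` table here are illustrative stand-ins, not the gem's public API.

```ruby
require "set"

# Tiny stand-in synonym table; the gem ships a much larger built-in list.
SYNONYMS = {
  "email"      => %w[email e_mail e_mail_address mail],
  "first_name" => %w[first_name firstname fname first]
}.freeze

# Normalize a raw header: downcase, spaces/dashes -> underscore, strip the rest.
def normalize(header)
  header.to_s.strip.downcase.gsub(/[\s-]+/, "_").gsub(/[^a-z0-9_]/, "")
end

# Map each header to the first unused target column that matches exactly
# or via a synonym; unmatched headers map to "" (an empty select).
def auto_map(headers, targets)
  used = Set.new
  headers.each_with_object({}) do |h, map|
    n = normalize(h)
    match = targets.find { |t| t == n && !used.include?(t) } ||
            targets.find { |t| SYNONYMS[t]&.include?(n) && !used.include?(t) }
    used << match if match
    map[h] = match || ""
  end
end

auto_map(["E-mail Address", "fname"], %w[email first_name])
# => { "E-mail Address" => "email", "fname" => "first_name" }
```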
data/README.md CHANGED
@@ -103,6 +103,8 @@ pending -> parsing -> previewing -> importing -> completed
 
  **[Full documentation on GitHub Pages](https://seryllns.github.io/data_porter/)**
 
+ > **Build series**: Want to see how DataPorter was built step by step? [Building DataPorter on dev.to](https://dev.to/seryllns_/series/35813) -- 30 parts covering architecture, TDD, and every feature from first commit to production.
+
  | Topic | Description |
  |---|---|
  | [Configuration](docs/CONFIGURATION.md) | All options, authentication, context builder, real-time updates |
data/ROADMAP.md CHANGED
@@ -2,18 +2,10 @@
 
  ## Next
 
- ### Bulk import
-
- High-volume import support using `insert_all` / `upsert_all` for batch persistence. Opt-in per target to bypass per-record `persist` calls, enabling 10-100x throughput for simple create/upsert scenarios. Configurable batch size, with fallback to per-record mode on conflict.
-
  ### Update & diff mode
 
  Support update (upsert) imports alongside create-only. Given a `deduplicate_by` key, detect existing records and show a diff preview: new records, changed fields (highlighted), unchanged rows. User confirms which changes to apply. Enables recurring data sync workflows.
 
- ### Resume / retry on failure
-
- If an import fails mid-way (timeout, crash, transient error), resume from the last successful record instead of restarting from scratch. Track a checkpoint index in the report. Critical for large imports (5k+ records) where re-processing everything is not acceptable.
-
  ### API pagination
 
  Support paginated API sources. The current API source does a single GET, which works for small datasets but not for APIs returning thousands of records across multiple pages. Support offset, cursor, and link-header pagination strategies via `api_config`:
@@ -34,10 +26,6 @@ Headless REST API for programmatic imports:
  - Auth via `config.api_authenticate` lambda (API key or Bearer token)
  - Reuses existing job pipeline (parse, import, dry run)
 
- ### Auto-map heuristics
-
- Smart column mapping suggestions using tokenized header matching and synonym dictionaries. When a CSV has "E-mail Address", auto-suggest mapping to `:email`. Built-in synonyms for common patterns (phone → phone_number, first name → first_name). Configurable synonym lists per target.
-
  ---
 
  ## Ideas
@@ -27,7 +27,21 @@ module DataPorter
  saved = @import.config&.dig("column_mapping")
  return saved if saved.present?
 
- (target._csv_mappings || {}).transform_values(&:to_s)
+ code_mapping = (target._csv_mappings || {}).transform_values(&:to_s)
+ return code_mapping if code_mapping.present?
+
+ auto_map_suggestions(target)
+ end
+
+ def auto_map_suggestions(target)
+ columns = target._columns || []
+ return {} if columns.empty? || @file_headers.empty?
+
+ custom = columns.each_with_object({}) do |col, hash|
+ hash[col.name] = col.synonyms if col.synonyms.any?
+ end
+
+ AutoMapper.new(@file_headers, columns.map(&:name), custom_synonyms: custom).call
  end
 
  def save_column_mapping
@@ -10,7 +10,7 @@ module DataPorter
  layout "data_porter/application"
 
  before_action :set_import, only: %i[show parse confirm cancel dry_run update_mapping
- status export_rejects destroy back_to_mapping]
+ status export_rejects destroy back_to_mapping resume]
  before_action :load_targets, only: %i[index new create]
 
  def index
@@ -69,6 +69,12 @@ module DataPorter
  redirect_to import_path(@import)
  end
 
+ def resume
+ @import.update!(status: :pending)
+ DataPorter::ImportJob.perform_later(@import.id)
+ redirect_to import_path(@import)
+ end
+
  def dry_run
  @import.update!(status: :pending)
  DataPorter::DryRunJob.perform_later(@import.id)
@@ -53,12 +53,16 @@ module DataPorter
  records.group_by(&:status).transform_values(&:count)
  end
 
+ def resumable?
+ failed? && config&.dig("checkpoint", "processed").to_i.positive?
+ end
+
  def reset_to_mapping!
  update!(
  status: :mapping,
  records: [],
  report: StoreModels::Report.new,
- config: (config || {}).except("progress")
+ config: (config || {}).except("progress", "checkpoint")
  )
  end
 
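The `resumable?` guard above can be sketched without ActiveRecord. This is a plain-Ruby stand-in where `status` and `config` replace the model's attributes; the point is that a failed import only offers Resume when its stored checkpoint shows at least one processed record.

```ruby
# Plain-Ruby sketch of the resumable? predicate: failed status AND a
# checkpoint with a positive processed count. `config` may be nil, since
# an import that never started has no config hash yet.
def resumable?(status, config)
  status == "failed" && config.to_h.dig("checkpoint", "processed").to_i.positive?
end

resumable?("failed", { "checkpoint" => { "processed" => 120 } }) # => true
resumable?("failed", {})                                          # => false
```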
@@ -28,7 +28,7 @@
  <% @imports.each do |import| %>
  <tr>
  <td><%= import.id %></td>
- <td><%= import.target_key %></td>
+ <td><% target_cls = import.target_class rescue nil %><% if target_cls&._icon.present? %><i class="<%= target_cls._icon %>"></i> <% end %><%= target_cls&._label || import.target_key %></td>
  <td><%= import.source_type %></td>
  <td><%= raw DataPorter::Components::Shared::StatusBadge.new(status: import.status).call %></td>
  <td><%= import.created_at&.strftime("%Y-%m-%d %H:%M") %></td>
@@ -4,7 +4,7 @@
  <%= link_to t("data_porter.imports.back_to_imports"), imports_path, class: "dp-btn dp-btn--secondary" %>
  </div>
  <h1 class="dp-title">
- <%= t("data_porter.imports.show_title", target: @target._label, id: @import.id) %>
+ <% if @target._icon.present? %><i class="<%= @target._icon %>"></i> <% end %><%= t("data_porter.imports.show_title", target: @target._label, id: @import.id) %>
  </h1>
  <%= raw DataPorter::Components::Shared::StatusBadge.new(status: @import.status).call %>
  </div>
@@ -12,7 +12,7 @@
  <div class="dp-import-details">
  <dl class="dp-details-grid">
  <dt><%= t("data_porter.imports.details.target") %></dt>
- <dd><%= @target._label %></dd>
+ <dd><% if @target._icon.present? %><i class="<%= @target._icon %>"></i> <% end %><%= @target._label %></dd>
  <dt><%= t("data_porter.imports.details.source") %></dt>
  <dd><%= @import.source_type.upcase %></dd>
  <% if @import.file.attached? %>
@@ -103,8 +103,12 @@
  <% if @import.failed? %>
  <%= raw DataPorter::Components::Shared::FailureAlert.new(report: @import.report).call %>
  <div class="dp-actions">
+ <% if @import.resumable? %>
+ <%= button_to t("data_porter.imports.resume"), resume_import_path(@import),
+ method: :post, class: "dp-btn dp-btn--primary" %>
+ <% end %>
  <%= button_to t("data_porter.imports.retry"), parse_import_path(@import),
- method: :post, class: "dp-btn dp-btn--primary" %>
+ method: :post, class: "dp-btn dp-btn--secondary" %>
  <%= button_to t("data_porter.imports.delete"), import_path(@import),
  method: :delete, class: "dp-btn dp-btn--danger",
  data: { turbo_confirm: t("data_porter.imports.delete_confirm") } %>
@@ -6,6 +6,7 @@
  <title>DataPorter</title>
  <%= csrf_meta_tags %>
  <%= stylesheet_link_tag "data_porter/application" %>
+ <link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/6.5.1/css/all.min.css" integrity="sha512-DTOQO9RWCH3ppGqcWaEA1BIZOC6xxalwEsw9c2QQeAIftl+Vegovlnee1c9QX4TctnWMn13TZye+giMm8e2LwA==" crossorigin="anonymous" referrerpolicy="no-referrer" />
  <script type="importmap">
  {
  "imports": {
@@ -10,6 +10,7 @@ en:
  delete: "Delete"
  delete_confirm: "Delete this import?"
  retry: "Retry"
+ resume: "Resume"
  start_import: "Start Import"
  confirm_import: "Confirm Import"
  dry_run: "Dry Run"
@@ -10,6 +10,7 @@ fr:
  delete: "Supprimer"
  delete_confirm: "Supprimer cet import ?"
  retry: "Réessayer"
+ resume: "Reprendre"
  start_import: "Lancer l'import"
  confirm_import: "Confirmer l'import"
  dry_run: "Essai à blanc"
data/config/routes.rb CHANGED
@@ -10,6 +10,7 @@ DataPorter::Engine.routes.draw do
  post :cancel
  post :back_to_mapping
  post :dry_run
+ post :resume
  patch :update_mapping
  get :status
  get :export_rejects
@@ -0,0 +1,87 @@
+ # frozen_string_literal: true
+
+ module DataPorter
+ class AutoMapper
+ SYNONYMS = {
+ email: %w[email e_mail email_address e_mail_address courriel mail],
+ first_name: %w[first_name firstname fname first prenom],
+ last_name: %w[last_name lastname lname last nom],
+ name: %w[name full_name fullname nom_complet],
+ phone_number: %w[phone_number phone tel telephone mobile cell],
+ address: %w[address addr street adresse],
+ city: %w[city ville town],
+ zip_code: %w[zip_code zip postal_code postcode code_postal],
+ country: %w[country pays nation],
+ company: %w[company company_name organization organisation entreprise societe],
+ title: %w[title job_title position titre poste],
+ description: %w[description desc notes],
+ quantity: %w[quantity qty amount],
+ price: %w[price unit_price prix montant],
+ date: %w[date created_at updated_at],
+ status: %w[status state statut etat],
+ id: %w[id identifier external_id ref reference]
+ }.freeze
+
+ def initialize(headers, target_columns, custom_synonyms: {})
+ @headers = headers
+ @target_columns = target_columns.map(&:to_s)
+ @custom_synonyms = custom_synonyms
+ end
+
+ def call
+ used = Set.new
+ @headers.each_with_object({}) do |header, mapping|
+ match = find_match(header, used)
+ used.add(match) if match
+ mapping[header] = match || ""
+ end
+ end
+
+ private
+
+ def find_match(header, used)
+ normalized = normalize(header)
+ return nil if normalized.empty?
+
+ exact_match(normalized, used) || synonym_match(normalized, used)
+ end
+
+ def exact_match(normalized, used)
+ @target_columns.find { |col| col == normalized && !used.include?(col) }
+ end
+
+ def synonym_match(normalized, used)
+ lookup_table[normalized]&.find { |col| !used.include?(col) }
+ end
+
+ def lookup_table
+ @lookup_table ||= build_lookup_table
+ end
+
+ def build_lookup_table
+ table = Hash.new { |h, k| h[k] = [] }
+ merged_synonyms.each do |column, synonyms|
+ col_name = column.to_s
+ next unless @target_columns.include?(col_name)
+
+ synonyms.each { |syn| table[syn] << col_name }
+ end
+ table
+ end
+
+ def merged_synonyms
+ result = SYNONYMS.transform_values(&:dup)
+ @custom_synonyms.each do |column, syns|
+ key = column.to_sym
+ result[key] = (result.fetch(key, []) + syns.map { |s| normalize(s) }).uniq
+ end
+ result
+ end
+
+ def normalize(header)
+ return "" if header.nil?
+
+ header.to_s.strip.downcase.gsub(/[\s-]+/, "_").gsub(/[^a-z0-9_]/, "")
+ end
+ end
+ end
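The `merged_synonyms` step above (per-column `synonyms:` entries from the DSL appended to the built-in lists, normalized and de-duplicated) can be isolated into a runnable sketch. `DEFAULTS` is a tiny stand-in for the gem's `SYNONYMS` constant; the free-standing `merged_synonyms` helper here mirrors the private method, not the gem's API.

```ruby
# Tiny stand-in for the built-in synonym table.
DEFAULTS = { email: %w[email mail] }.freeze

# Same normalization the mapper applies to headers and custom synonyms.
def normalize(s)
  s.to_s.strip.downcase.gsub(/[\s-]+/, "_").gsub(/[^a-z0-9_]/, "")
end

# Merge custom per-column synonyms into a dup of the defaults: the
# defaults stay frozen and untouched, new columns get fresh lists.
def merged_synonyms(custom)
  result = DEFAULTS.transform_values(&:dup)
  custom.each do |column, syns|
    key = column.to_sym
    result[key] = (result.fetch(key, []) + syns.map { |s| normalize(s) }).uniq
  end
  result
end

merged_synonyms(email: ["Courriel principal"], sku: ["Item Code"])
# => { email: ["email", "mail", "courriel_principal"], sku: ["item_code"] }
```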
@@ -2,14 +2,15 @@
 
  module DataPorter
  module DSL
- Column = Struct.new(:name, :type, :required, :label, :transform, :options, keyword_init: true) do
- def initialize(name:, type: :string, required: false, label: nil, transform: [], **options)
+ Column = Struct.new(:name, :type, :required, :label, :transform, :synonyms, :options, keyword_init: true) do
+ def initialize(name:, type: :string, required: false, label: nil, transform: [], synonyms: [], **options)
  super(
  name: name.to_sym,
  type: type.to_sym,
  required: required,
  label: label || name.to_s.humanize,
  transform: Array(transform),
+ synonyms: Array(synonyms),
  options: options
  )
  end
@@ -0,0 +1,62 @@
+ # frozen_string_literal: true
+
+ module DataPorter
+ class Orchestrator
+ module BulkImporter
+ private
+
+ def import_bulk
+ importable = @data_import.importable_records
+ checkpoint = load_checkpoint
+ @bulk_state = build_bulk_state(importable, checkpoint)
+
+ process_batches(importable.drop(checkpoint[:processed]))
+ finalize_import(@bulk_state[:results])
+ end
+
+ def build_bulk_state(importable, checkpoint)
+ {
+ context: build_context,
+ bulk_config: @target.class._bulk_config,
+ results: seed_results(checkpoint),
+ total: importable.size,
+ processed: checkpoint[:processed]
+ }
+ end
+
+ def process_batches(records)
+ records.each_slice(@bulk_state[:bulk_config][:batch_size]) do |batch|
+ persist_batch_with_fallback(batch)
+ @bulk_state[:processed] += batch.size
+ broadcast_progress(@bulk_state[:processed], @bulk_state[:total], results: @bulk_state[:results])
+ end
+ end
+
+ def persist_batch_with_fallback(batch)
+ @target.persist_batch(batch, context: @bulk_state[:context])
+ @bulk_state[:results][:created] += batch.size
+ rescue StandardError => e
+ handle_batch_failure(batch, e)
+ end
+
+ def handle_batch_failure(batch, error)
+ if @bulk_state[:bulk_config][:on_conflict] == :fail_batch
+ fail_batch(batch, error)
+ else
+ retry_per_record(batch)
+ end
+ end
+
+ def fail_batch(batch, error)
+ batch.each { |record| record.add_error(error.message) }
+ @bulk_state[:results][:errored] += batch.size
+ end
+
+ def retry_per_record(batch)
+ batch.each do |record|
+ persist_record(record, @bulk_state[:context], @bulk_state[:results])
+ end
+ end
+ end
+ end
+ end
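The two conflict strategies above reduce to a small control-flow pattern: try the whole batch, and on failure either mark the batch errored (`:fail_batch`) or retry record-by-record (`:retry_per_record`). This standalone sketch simulates that with a `:bad` sentinel record standing in for an `insert_all` failure; `import_batches` is a hypothetical helper, not the gem's interface.

```ruby
# Batch import with configurable conflict strategy. A batch containing
# :bad raises, simulating a failed insert_all call.
def import_batches(records, batch_size:, on_conflict: :retry_per_record)
  results = { created: 0, errored: 0 }
  records.each_slice(batch_size) do |batch|
    begin
      raise "duplicate key" if batch.include?(:bad) # stand-in for insert_all failing
      results[:created] += batch.size
    rescue StandardError
      if on_conflict == :fail_batch
        # Whole batch marked errored, nothing retried.
        results[:errored] += batch.size
      else
        # Retry each record individually; only the bad one errors.
        batch.each { |r| r == :bad ? results[:errored] += 1 : results[:created] += 1 }
      end
    end
  end
  results
end

import_batches([:a, :b, :bad, :c], batch_size: 2)
# => { created: 3, errored: 1 }
```

With `on_conflict: :fail_batch` the same input yields `{ created: 2, errored: 2 }`, since the second batch fails as a unit.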
@@ -6,7 +6,9 @@ module DataPorter
  private
 
  def import_records
- if DataPorter.configuration.transaction_mode == :all
+ if @target.class._bulk_config
+ import_bulk
+ elsif DataPorter.configuration.transaction_mode == :all
  import_all_or_nothing
  else
  import_per_record
@@ -16,12 +18,14 @@ module DataPorter
  def import_per_record
  importable = @data_import.importable_records
  context = build_context
- results = { created: 0, errored: 0 }
+ checkpoint = load_checkpoint
+ results = seed_results(checkpoint)
+ remaining = importable.drop(checkpoint[:processed])
  total = importable.size
 
- importable.each_with_index do |record, index|
+ remaining.each_with_index do |record, index|
  persist_record(record, context, results)
- broadcast_progress(index + 1, total)
+ broadcast_progress(checkpoint[:processed] + index + 1, total, results: results)
  end
 
  finalize_import(results)
@@ -43,6 +47,7 @@ module DataPorter
  end
 
  def finalize_import(results)
+ clear_checkpoint
  @data_import.update!(status: :completed)
  @broadcaster.success
  WebhookNotifier.notify(@data_import, "import.completed")
@@ -64,6 +69,25 @@ module DataPorter
  report.errored_count = results[:errored]
  @data_import.update!(report: report)
  end
+
+ def load_checkpoint
+ cp = @data_import.config&.dig("checkpoint") || {}
+ {
+ processed: cp["processed"].to_i,
+ created: cp["created"].to_i,
+ errored: cp["errored"].to_i
+ }
+ end
+
+ def seed_results(checkpoint)
+ { created: checkpoint[:created], errored: checkpoint[:errored] }
+ end
+
+ def clear_checkpoint
+ config = @data_import.config || {}
+ config.delete("checkpoint")
+ @data_import.update_column(:config, config)
+ end
  end
  end
  end
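The resume arithmetic above (seed the counters from the stored checkpoint, then `drop` the already-processed prefix) is easy to verify in isolation. This standalone sketch combines `load_checkpoint` and `seed_results` into one hypothetical `resume_plan` helper; strings stand in for record objects.

```ruby
# Given all importable records and a stored checkpoint hash (string keys,
# as it comes out of the JSONB column), return the seeded counters and
# the records still to process. An empty checkpoint means a fresh run.
def resume_plan(records, checkpoint)
  processed = checkpoint.fetch("processed", 0).to_i
  {
    results: { created: checkpoint.fetch("created", 0).to_i,
               errored: checkpoint.fetch("errored", 0).to_i },
    remaining: records.drop(processed)
  }
end

plan = resume_plan(%w[r1 r2 r3 r4 r5], { "processed" => 3, "created" => 2, "errored" => 1 })
plan[:remaining] # => ["r4", "r5"]
plan[:results]   # => { created: 2, errored: 1 }
```

Note the broadcast index in the real code is `checkpoint[:processed] + index + 1`, so progress keeps counting against the full total rather than restarting at 1.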
@@ -2,12 +2,14 @@
 
  require_relative "orchestrator/record_builder"
  require_relative "orchestrator/importer"
+ require_relative "orchestrator/bulk_importer"
  require_relative "orchestrator/dry_runner"
 
  module DataPorter
  class Orchestrator
  include RecordBuilder
  include Importer
+ include BulkImporter
  include DryRunner
 
  def initialize(data_import, content: nil)
@@ -30,6 +32,7 @@ module DataPorter
  def parse!
  @data_import.parsing!
  records = build_records
+ clear_stale_import_data
  @data_import.update!(records: records, status: :previewing)
  build_report
  WebhookNotifier.notify(@data_import, "import.parsed")
@@ -90,18 +93,36 @@ module DataPorter
  DataPorter.configuration.context_builder&.call(@data_import)
  end
 
- def broadcast_progress(current, total)
- percentage = ((current.to_f / total) * 100).round
+ def broadcast_progress(current, total, results: nil)
  config = @data_import.config || {}
- config["progress"] = { "current" => current, "total" => total, "percentage" => percentage }
+ config["progress"] = { "current" => current, "total" => total, "percentage" => pct(current, total) }
+ save_checkpoint(config, current, results) if results
  @data_import.update_column(:config, config)
  @broadcaster.progress(current, total)
  end
 
+ def pct(current, total)
+ ((current.to_f / total) * 100).round
+ end
+
+ def save_checkpoint(config, processed, results)
+ config["checkpoint"] = {
+ "processed" => processed,
+ "created" => results[:created],
+ "errored" => results[:errored]
+ }
+ end
+
+ def clear_stale_import_data
+ config = @data_import.config || {}
+ config.delete("checkpoint")
+ config.delete("progress")
+ @data_import.config = config
+ end
+
  def handle_failure(error)
- report = StoreModels::Report.new(
- error_reports: [StoreModels::Error.new(message: error.message)]
- )
+ report = @data_import.report || StoreModels::Report.new
+ report.error_reports = [StoreModels::Error.new(message: error.message)]
  @data_import.update!(status: :failed, report: report)
  @broadcaster.failure(error.message)
  WebhookNotifier.notify(@data_import, "import.failed")
@@ -10,7 +10,8 @@ module DataPorter
  class << self
  attr_reader :_label, :_model_name, :_icon, :_sources,
  :_columns, :_csv_mappings, :_dedup_keys, :_json_root,
- :_api_config, :_dry_run_enabled, :_params, :_webhooks
+ :_api_config, :_dry_run_enabled, :_params, :_webhooks,
+ :_bulk_config
 
  def label(value)
  @_label = value
@@ -82,6 +83,10 @@ module DataPorter
  @_webhooks << DSL::Webhook.new(url: url, **)
  end
 
+ def bulk_mode(batch_size: 500, on_conflict: :retry_per_record)
+ @_bulk_config = { batch_size: batch_size, on_conflict: on_conflict }
+ end
+
  private
 
  def auto_register
@@ -108,6 +113,16 @@ module DataPorter
  raise NotImplementedError
  end
 
+ def persist_batch(records, context: nil) # rubocop:disable Lint/UnusedMethodArgument
+ raise Error, "model_name is required for default persist_batch" unless self.class._model_name
+
+ now = Time.current
+ model_class = self.class._model_name.constantize
+ model_class.insert_all(
+ records.map { |r| r.data.merge("created_at" => now, "updated_at" => now) }
+ )
+ end
+
  def after_import(_results, context:); end
 
  def on_error(_record, _error, context:); end
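The timestamp injection in the default `persist_batch` above exists because `insert_all` bypasses ActiveRecord callbacks and automatic timestamps. The row-shaping step can be checked without a database; in this standalone sketch, `Record` and `rows_for_insert_all` are stand-ins for the gem's record objects and the private mapping step.

```ruby
# Minimal stand-in for an import record carrying a data hash.
Record = Struct.new(:data)

# Shape each record's data for insert_all: merge in created_at/updated_at,
# since insert_all writes raw rows and will not set timestamps itself.
def rows_for_insert_all(records, now:)
  records.map { |r| r.data.merge("created_at" => now, "updated_at" => now) }
end

rows_for_insert_all([Record.new({ "email" => "a@b.co" })], now: "2026-02-21T00:00:00Z")
# => [{ "email" => "a@b.co", "created_at" => "2026-02-21T00:00:00Z", "updated_at" => "2026-02-21T00:00:00Z" }]
```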
@@ -1,5 +1,5 @@
  # frozen_string_literal: true
 
  module DataPorter
- VERSION = "2.4.0"
+ VERSION = "2.6.0"
  end
data/lib/data_porter.rb CHANGED
@@ -20,6 +20,7 @@ require_relative "data_porter/record_validator"
  require_relative "data_porter/broadcaster"
  require_relative "data_porter/webhook_notifier"
  require_relative "data_porter/orchestrator"
+ require_relative "data_porter/auto_mapper"
  require_relative "data_porter/rejects_csv_builder"
  require_relative "data_porter/components"
  require_relative "data_porter/engine"
@@ -37,6 +37,10 @@ DataPorter.configure do |config|
  # Set to nil to disable auto-purge. Run `rake data_porter:purge` manually or via cron.
  # config.purge_after = 60.days
 
+ # Bulk import: enable per-target via `bulk_mode` in your Target class.
+ # Uses insert_all for 10-100x throughput on large imports.
+ # See docs/ADVANCED.md for configuration options.
+
  # HMAC-SHA256 secret for signing webhook payloads.
  # When set, every webhook request includes an X-DataPorter-Signature header.
  # Set to nil to disable signing (default).
@@ -5,6 +5,7 @@ class <%= target_class_name %> < DataPorter::Target
  model_name "<%= model_name %>"
  icon "fas fa-file-import"
  sources <%= target_sources %>
+ # bulk_mode batch_size: 500, on_conflict: :retry_per_record
  <% if parsed_columns.any? %>
 
  columns do
data/mkdocs.yml CHANGED
@@ -92,7 +92,9 @@ nav:
  - Targets: TARGETS.md
  - Sources: SOURCES.md
  - Column Mapping: MAPPING.md
+ - Views & Theming: VIEWS.md
  - Routes: routes.md
+ - Advanced: ADVANCED.md
  - Roadmap: ROADMAP.md
  - Changelog: changelog.md
  - Contributing: contributing.md
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: data_porter
  version: !ruby/object:Gem::Version
- version: 2.4.0
+ version: 2.6.0
  platform: ruby
  authors:
  - Seryl Lounis
@@ -146,6 +146,7 @@ files:
  - config/locales/fr.yml
  - config/routes.rb
  - lib/data_porter.rb
+ - lib/data_porter/auto_mapper.rb
  - lib/data_porter/broadcaster.rb
  - lib/data_porter/column_transformer.rb
  - lib/data_porter/components.rb
@@ -167,6 +168,7 @@ files:
  - lib/data_porter/dsl/webhook.rb
  - lib/data_porter/engine.rb
  - lib/data_porter/orchestrator.rb
+ - lib/data_porter/orchestrator/bulk_importer.rb
  - lib/data_porter/orchestrator/dry_runner.rb
  - lib/data_porter/orchestrator/importer.rb
  - lib/data_porter/orchestrator/record_builder.rb