data_porter 0.9.0 → 1.0.2
- checksums.yaml +4 -4
- data/CHANGELOG.md +48 -0
- data/README.md +5 -1
- data/app/assets/javascripts/data_porter/import_form_controller.js +1 -0
- data/app/assets/javascripts/data_porter/template_form_controller.js +31 -8
- data/app/assets/stylesheets/data_porter/alerts.css +2 -1
- data/app/assets/stylesheets/data_porter/layout.css +2 -2
- data/app/controllers/data_porter/concerns/import_validation.rb +29 -0
- data/app/controllers/data_porter/concerns/mapping_management.rb +13 -4
- data/app/controllers/data_porter/imports_controller.rb +28 -4
- data/app/views/data_porter/imports/show.html.erb +4 -0
- data/config/routes.rb +1 -0
- data/lib/data_porter/components/preview/results_summary.rb +6 -1
- data/lib/data_porter/configuration.rb +7 -1
- data/lib/data_porter/orchestrator/importer.rb +27 -0
- data/lib/data_porter/orchestrator/record_builder.rb +9 -0
- data/lib/data_porter/registry.rb +7 -1
- data/lib/data_porter/rejects_csv_builder.rb +35 -0
- data/lib/data_porter/sources/base.rb +6 -0
- data/lib/data_porter/sources/csv.rb +32 -5
- data/lib/data_porter/sources/xlsx.rb +2 -1
- data/lib/data_porter/version.rb +1 -1
- data/lib/data_porter.rb +1 -0
- data/lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb +1 -1
- data/lib/generators/data_porter/install/templates/initializer.rb +3 -5
- metadata +5 -11
- data/docs/CONFIGURATION.md +0 -103
- data/docs/MAPPING.md +0 -44
- data/docs/ROADMAP.md +0 -28
- data/docs/SOURCES.md +0 -94
- data/docs/TARGETS.md +0 -227
- data/docs/screenshots/index-with-previewing.jpg +0 -0
- data/docs/screenshots/index.jpg +0 -0
- data/docs/screenshots/mapping.jpg +0 -0
- data/docs/screenshots/modal-new-import.jpg +0 -0
- data/docs/screenshots/preview.jpg +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7ca6bfabfc9f831d71c60a1942516a5dccf95c85e3787f16a1217188c9feb3a0
+  data.tar.gz: f703da9261612953fcacad2674e38bef3037b804191e7fb3087577846b096461
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a7d8ad32cb5d80e027d9adfe2a089a84703c6e8e5b00901c0d057a4b2bb24cb2ffbe0f6edb53f61021219fd03384830517192657fbe4c693dd74b6977279b22a
+  data.tar.gz: 6d96ecefa39d191cea801ff8e4075f5c9bb4e13979f7754041db33a6685701d98f4521230500a1d0a47a417ce141f1b08973b5cbfd65868b0c3920c99afb61f6
data/CHANGELOG.md
CHANGED
@@ -5,6 +5,54 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.0.2] - 2026-02-07
+
+### Changed
+
+- Exclude `docs/` from gem package (194 KB → 80 KB)
+
+## [1.0.1] - 2026-02-07
+
+### Added
+
+- **CSV delimiter auto-detection** -- Automatically detect `,` `;` `\t` separators via frequency analysis on the first line; explicit `col_sep` config still takes precedence
+- **CSV encoding auto-detection** -- Detect and transcode Latin-1 / ISO-8859-1 content to UTF-8; strip UTF-8 BOM when present
+
+### Fixed
+
+- **`param.collection` accepts arrays** -- `Registry.serialize_param` now duck-types with `respond_to?(:call)` so both lambdas and plain arrays work
+- **`dp-input` styling** -- Text inputs now share the same CSS rules as `dp-select` and `dp-file-input`
+- **Migration template nullable user** -- Removed `null: false` from polymorphic `user` reference so the engine works without authentication
+- **Skipped records visible in results** -- Added "Skipped" stat card for missing + partial records; title reflects errors; export rejects button includes all rejected rows
+- **Hidden param label removed** -- `type: :hidden` params no longer render a label or wrapper div
+
+### Changed
+
+- 402 RSpec examples (up from 391), 0 failures
+
+## [1.0.0] - 2026-02-07
+
+### Added
+
+- **Max records guard** -- `config.max_records` (default: 10,000) rejects files exceeding the limit before parsing
+- **Transaction mode** -- `config.transaction_mode` (`:per_record` or `:all`); `:all` wraps entire import in a single transaction that rolls back on any failure
+- **Fallback headers** -- Auto-generate `col_1, col_2...` when CSV/XLSX header row is empty
+- **Reject rows CSV export** -- Download CSV of failed/errored records with original data + error messages after import; available when `errored_count > 0`
+- **E2E specs** -- 6 end-to-end integration tests covering all source types (CSV, XLSX, JSON, API), import params, and reject rows export
+
+### Fixed
+
+- **Import params whitelist** -- `merge_import_params` now permits only param names declared in the Target DSL instead of using `permit!`
+- **Column mapping whitelist** -- `permitted_column_mapping` filters mapping values to valid target column names; invalid values replaced with `""`
+- **File size validation** -- Uploads exceeding `config.max_file_size` (default: 10 MB) are rejected before save
+- **MIME type validation** -- Uploaded files must match allowed content types per source (CSV: `text/csv`, `text/plain`; JSON: `application/json`, `text/plain`; XLSX: OpenXML spreadsheet)
+- **XSS in template form** -- Replaced `innerHTML` with safe DOM methods in `template_form_controller.js`
+
+### Changed
+
+- Validation chain refactored to `all_validations_pass?` using `.all?` to collect all errors at once instead of short-circuiting
+- 391 RSpec examples (up from 354), 0 failures
+
 ## [0.9.0] - 2026-02-07
 
 ### Added
data/README.md
CHANGED
@@ -30,6 +30,9 @@ Supports CSV, JSON, XLSX, and API sources with a declarative DSL for defining im
 - **Import params** -- Declare extra form fields (select, text, number, hidden) per target for scoped imports ([docs](docs/TARGETS.md#params--))
 - **Per-target source filtering** -- Each target declares its allowed sources, the UI filters accordingly
 - **Import deletion & auto-purge** -- Delete imports from the UI, or schedule `rake data_porter:purge` for automatic cleanup
+- **Reject rows export** -- Download a CSV of failed/errored records with error messages after import
+- **Security validations** -- File size limit, MIME type check, strong parameter whitelisting
+- **Safety guards** -- Max records limit (`config.max_records`), configurable transaction mode (`:per_record` or `:all`)
 - **Declarative Target DSL** -- One class per import type, zero boilerplate ([docs](docs/TARGETS.md))
 
 ## Requirements
@@ -129,6 +132,7 @@ pending -> parsing -> previewing -> importing -> completed
 | POST | `/imports/:id/confirm` | Run import |
 | POST | `/imports/:id/cancel` | Cancel import |
 | POST | `/imports/:id/dry_run` | Dry run validation |
+| GET | `/imports/:id/export_rejects` | Download rejects CSV |
 | | `/mapping_templates` | Full CRUD for templates |
 
 ## Development
@@ -137,7 +141,7 @@ pending -> parsing -> previewing -> importing -> completed
 git clone https://github.com/SerylLns/data_porter.git
 cd data_porter
 bin/setup
-bundle exec rspec #
+bundle exec rspec # 391 specs
 bundle exec rubocop # 0 offenses
 ```
 
data/app/assets/javascripts/data_porter/template_form_controller.js
CHANGED
@@ -18,7 +18,8 @@ export default class extends Controller {
     const pair = document.createElement("div")
     pair.className = "dp-mapping-pair"
     pair.style.cssText = "display: flex; gap: 0.5rem; margin-bottom: 0.5rem;"
-    pair.
+    pair.appendChild(this.buildKeyInput())
+    pair.appendChild(this.buildValueSelect(columns))
     container.appendChild(pair)
   }
 
@@ -34,13 +35,35 @@ export default class extends Controller {
     })
   }
 
-
-  const
-
-
+  buildKeyInput() {
+    const input = document.createElement("input")
+    input.type = "text"
+    input.name = "mapping_template[mapping_keys][]"
+    input.placeholder = "File header"
+    input.className = "dp-select"
+    input.style.flex = "1"
+    return input
+  }
+
+  buildValueSelect(columns) {
+    const select = document.createElement("select")
+    select.name = "mapping_template[mapping_values][]"
+    select.className = "dp-select"
+    select.style.flex = "1"
+    select.dataset.dataPorterTemplateFormTarget = "fieldSelect"
+
+    const blank = document.createElement("option")
+    blank.value = ""
+    blank.textContent = "Select a field..."
+    select.appendChild(blank)
+
+    columns.forEach(([label, name]) => {
+      const opt = document.createElement("option")
+      opt.value = name
+      opt.textContent = label
+      select.appendChild(opt)
+    })
 
-    return
-      `<select name="mapping_template[mapping_values][]" class="dp-select" style="flex: 1;" data-data-porter--template-form-target="fieldSelect">` +
-      `<option value="">Select a field...</option>${options}</select>`
+    return select
   }
 }
data/app/assets/stylesheets/data_porter/alerts.css
CHANGED
@@ -32,7 +32,7 @@
   display: grid;
   grid-template-columns: repeat(auto-fit, minmax(120px, 1fr));
   gap: 1rem;
-  max-width:
+  max-width: 500px;
   margin: 0 auto;
 }
 
@@ -60,6 +60,7 @@
 
 .dp-results__stat--success strong { color: var(--dp-success); }
 .dp-results__stat--error strong { color: var(--dp-danger); }
+.dp-results__stat--warning strong { color: var(--dp-warning); }
 
 .dp-results__duration {
   margin-top: 1rem;
data/app/assets/stylesheets/data_porter/layout.css
CHANGED
@@ -29,7 +29,7 @@
   color: var(--dp-gray-700);
 }
 
-.dp-select, .dp-file-input {
+.dp-select, .dp-input, .dp-file-input {
   display: block;
   width: 100%;
   padding: 0.625rem 0.875rem;
@@ -50,7 +50,7 @@
   padding-right: 2.5rem;
 }
 
-.dp-select:focus, .dp-file-input:focus {
+.dp-select:focus, .dp-input:focus, .dp-file-input:focus {
   outline: none;
   border-color: var(--dp-primary);
   box-shadow: 0 0 0 3px rgba(79, 70, 229, 0.15);
data/app/controllers/data_porter/concerns/import_validation.rb
CHANGED
@@ -5,6 +5,12 @@ module DataPorter
   module ImportValidation
     extend ActiveSupport::Concern
 
+    ALLOWED_CONTENT_TYPES = {
+      "csv" => %w[text/csv text/plain],
+      "json" => %w[application/json text/plain],
+      "xlsx" => %w[application/vnd.openxmlformats-officedocument.spreadsheetml.sheet]
+    }.freeze
+
     private
 
     def valid_source_for_target?
@@ -42,6 +48,29 @@ module DataPorter
     def import_param_values
       (@import.config || {}).fetch("import_params", {})
     end
+
+    def valid_file_size?
+      return true unless @import.file.attached?
+
+      max = DataPorter.configuration.max_file_size
+      return true if @import.file.blob.byte_size <= max
+
+      @import.errors.add(:file, "is too large (max #{max / 1.megabyte} MB)")
+      false
+    end
+
+    def valid_file_content_type?
+      return true unless @import.file.attached?
+
+      allowed = ALLOWED_CONTENT_TYPES[@import.source_type]
+      return true unless allowed
+
+      content_type = @import.file.blob.content_type
+      return true if allowed.include?(content_type)
+
+      @import.errors.add(:file, "has an invalid content type (#{content_type})")
+      false
+    end
   end
 end
 end
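The MIME validation in the concern reduces to a whitelist lookup keyed by source type. A standalone sketch of that lookup (`content_type_error` is a name invented here; the real concern records errors on the import instead of returning strings):

```ruby
# Per-source whitelist of acceptable MIME types, mirroring the concern's
# ALLOWED_CONTENT_TYPES constant.
ALLOWED_CONTENT_TYPES = {
  "csv"  => %w[text/csv text/plain],
  "json" => %w[application/json text/plain],
  "xlsx" => %w[application/vnd.openxmlformats-officedocument.spreadsheetml.sheet]
}.freeze

# Returns nil when the type is acceptable, or an error message otherwise.
def content_type_error(source_type, content_type)
  allowed = ALLOWED_CONTENT_TYPES[source_type]
  return nil unless allowed                      # sources without a whitelist are not restricted
  return nil if allowed.include?(content_type)

  "has an invalid content type (#{content_type})"
end
```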
data/app/controllers/data_porter/concerns/mapping_management.rb
CHANGED
@@ -23,8 +23,7 @@ module DataPorter
     end
 
     def save_column_mapping
-
-      merged = (@import.config || {}).merge("column_mapping" => mapping)
+      merged = (@import.config || {}).merge("column_mapping" => permitted_column_mapping)
       @import.update!(config: merged, status: :pending)
     end
 
@@ -32,11 +31,21 @@ module DataPorter
       return unless params[:save_template] == "1"
       return unless defined?(DataPorter::MappingTemplate)
 
-      mapping = params.require(:column_mapping).permit!.to_h
       DataPorter::MappingTemplate.find_or_initialize_by(
         target_key: @import.target_key,
         name: params[:template_name].presence || "Default"
-      ).update!(mapping:
+      ).update!(mapping: permitted_column_mapping)
+    end
+
+    def permitted_column_mapping
+      raw = params.require(:column_mapping).permit!.to_h
+      valid_names = valid_column_names
+      raw.transform_values { |v| valid_names.include?(v) ? v : "" }
+    end
+
+    def valid_column_names
+      columns = @import.target_class._columns || []
+      columns.to_set { |c| c.name.to_s }
    end
  end
 end
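The column-mapping whitelist is a pure transform: any submitted value that is not a declared column name collapses to `""`. A standalone sketch of that idea (`permitted_mapping` is a name invented here):

```ruby
require "set"

# Replace any mapping value that is not a declared column name with "",
# so arbitrary user-submitted field names never reach the importer.
def permitted_mapping(raw, valid_names)
  names = valid_names.to_set(&:to_s)
  raw.transform_values { |v| names.include?(v) ? v : "" }
end
```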
data/app/controllers/data_porter/imports_controller.rb
CHANGED
@@ -8,7 +8,7 @@ module DataPorter
 
     layout "data_porter/application"
 
-    before_action :set_import, only: %i[show parse confirm cancel dry_run update_mapping status destroy]
+    before_action :set_import, only: %i[show parse confirm cancel dry_run update_mapping status export_rejects destroy]
     before_action :load_targets, only: %i[index new create]
 
     def index
@@ -22,7 +22,7 @@ module DataPorter
     def create
       build_import
 
-      if
+      if all_validations_pass? && @import.save
         enqueue_after_create
         redirect_to import_path(@import)
       else
@@ -73,6 +73,12 @@ module DataPorter
       render json: { status: @import.status, progress: progress }
     end
 
+    def export_rejects
+      columns = @import.target_class._columns || []
+      csv = RejectsCsvBuilder.new(columns, @import.records).generate
+      send_data csv, filename: "rejects_import_#{@import.id}.csv", type: "text/csv"
+    end
+
     def destroy
       @import.file.purge if @import.file.attached?
       @import.destroy!
@@ -95,6 +101,16 @@ module DataPorter
       @import.status = :pending
     end
 
+    def all_validations_pass?
+      [
+        valid_source_for_target?,
+        valid_file_presence?,
+        valid_file_size?,
+        valid_file_content_type?,
+        valid_import_params?
+      ].all?
+    end
+
     def import_params
       permitted = params.require(:data_import).permit(:target_key, :source_type, :file, config: {})
       merge_import_params(permitted)
@@ -104,11 +120,19 @@ module DataPorter
       nested = params.dig(:data_import, :config, :import_params)
       return permitted unless nested
 
-      config = permitted[:config] || {}
-      config["import_params"] = nested.permit
+      config = permitted[:config]&.to_unsafe_h || {}
+      config["import_params"] = nested.permit(*allowed_param_keys).to_h
       permitted.merge(config: config)
     end
 
+    def allowed_param_keys
+      target_key = params.dig(:data_import, :target_key)
+      return [] unless target_key
+
+      target = DataPorter::Registry.find(target_key)
+      (target._params || []).map { |p| p.name.to_s }
+    end
+
     def enqueue_after_create
       if @import.file_based?
         DataPorter::ExtractHeadersJob.perform_later(@import.id)
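The `all_validations_pass?` refactor relies on building the array first, which forces every check to run before `.all?` summarizes them, so every error is recorded rather than stopping at the first failure. A minimal illustration with stand-in checks:

```ruby
# Eager validation: map(&:call) runs every check (letting each record its
# own error) before .all? computes the verdict; a short-circuiting && chain
# would stop at the first failure and hide the later errors.
def all_validations_pass?(validations)
  validations.map(&:call).all?
end

errors = []
checks = [
  -> { errors << "file too large"; false },
  -> { errors << "bad content type"; false },
  -> { true }
]
result = all_validations_pass?(checks)
```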
data/app/views/data_porter/imports/show.html.erb
CHANGED
@@ -86,6 +86,10 @@
 <% end %>
 <div class="dp-actions">
   <%= link_to "Back to imports", imports_path, class: "dp-btn dp-btn--primary" %>
+  <% rejected = @import.report.errored_count.to_i + @import.report.missing_count.to_i + @import.report.partial_count.to_i %>
+  <% if rejected.positive? %>
+    <%= link_to "Download rejects CSV", export_rejects_import_path(@import), class: "dp-btn dp-btn--secondary" %>
+  <% end %>
   <%= button_to "Delete", import_path(@import),
       method: :delete, class: "dp-btn dp-btn--danger",
       data: { turbo_confirm: "Delete this import?" } %>
data/config/routes.rb
CHANGED
|
data/lib/data_porter/components/preview/results_summary.rb
CHANGED
@@ -33,6 +33,7 @@ module DataPorter
       div(class: "dp-results__cards") do
         stat("dp-results__stat--success", @report.imported_count, "Imported")
         stat("dp-results__stat--error", @report.errored_count, "Errors")
+        stat("dp-results__stat--warning", skipped_count, "Skipped") if skipped_count.positive?
       end
     end
 
@@ -52,7 +53,11 @@ module DataPorter
     end
 
     def success?
-      @report.errored_count.zero?
+      @report.errored_count.zero? && skipped_count.zero?
+    end
+
+    def skipped_count
+      @report.missing_count.to_i + @report.partial_count.to_i
     end
   end
 end
data/lib/data_porter/configuration.rb
CHANGED
@@ -10,7 +10,10 @@ module DataPorter
                   :preview_limit,
                   :enabled_sources,
                   :scope,
-                  :purge_after
+                  :purge_after,
+                  :max_file_size,
+                  :max_records,
+                  :transaction_mode
 
     def initialize
       @parent_controller = "ApplicationController"
@@ -22,6 +25,9 @@ module DataPorter
       @enabled_sources = %i[csv json api xlsx]
       @scope = nil
       @purge_after = 60.days
+      @max_file_size = 10.megabytes
+      @max_records = 10_000
+      @transaction_mode = :per_record
     end
   end
 end
data/lib/data_porter/orchestrator/importer.rb
CHANGED
@@ -6,6 +6,14 @@ module DataPorter
     private
 
     def import_records
+      if DataPorter.configuration.transaction_mode == :all
+        import_all_or_nothing
+      else
+        import_per_record
+      end
+    end
+
+    def import_per_record
       importable = @data_import.importable_records
       context = build_context
       results = { created: 0, errored: 0 }
@@ -16,6 +24,25 @@ module DataPorter
         broadcast_progress(index + 1, total)
       end
 
+      finalize_import(results)
+    end
+
+    def import_all_or_nothing
+      importable = @data_import.importable_records
+      context = build_context
+      total = importable.size
+
+      ActiveRecord::Base.transaction do
+        importable.each_with_index do |record, index|
+          @target.persist(record, context: context)
+          broadcast_progress(index + 1, total)
+        end
+      end
+
+      finalize_import(created: total, errored: 0)
+    end
+
+    def finalize_import(results)
       @data_import.update!(status: :completed)
       @broadcaster.success
       results
data/lib/data_porter/orchestrator/record_builder.rb
CHANGED
@@ -8,6 +8,7 @@ module DataPorter
     def build_records
       source = build_source
       raw_rows = source.fetch
+      enforce_max_records!(raw_rows.size)
       columns = @target.class._columns || []
       validator = RecordValidator.new(columns)
 
@@ -16,6 +17,14 @@ module DataPorter
       end
     end
 
+    def enforce_max_records!(count)
+      max = DataPorter.configuration.max_records
+      return unless max
+      return if count <= max
+
+      raise Error, "File contains #{count} records, exceeds maximum of #{max}"
+    end
+
     def build_record(row, index, columns, validator)
       record = StoreModels::ImportRecord.new(
         line_number: index + 1,
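The max-records guard fails fast on the row count before any records are built. A standalone sketch (using `ArgumentError` in place of the engine's own `Error` class):

```ruby
# Fail fast before building records; the 10_000 default comes from the
# changelog. A nil max disables the guard.
def enforce_max_records!(count, max: 10_000)
  return unless max
  return if count <= max

  raise ArgumentError, "File contains #{count} records, exceeds maximum of #{max}"
end
```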
data/lib/data_porter/registry.rb
CHANGED
@@ -37,6 +37,12 @@ module DataPorter
 
     private
 
+    def resolve_collection(collection)
+      return unless collection
+
+      collection.respond_to?(:call) ? collection.call : collection
+    end
+
     def serialize_params(params)
       return [] unless params
 
@@ -50,7 +56,7 @@ module DataPorter
         required: param.required,
         label: param.label,
         default: param.default,
-        collection: param.collection
+        collection: resolve_collection(param.collection)
       }.compact
     end
   end
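`resolve_collection` duck-types on `call`, which is what makes both lambdas and plain arrays valid values for `param.collection`. Extracted as a standalone function:

```ruby
# Duck-typed resolution: anything callable (lambda, proc, method object)
# is invoked at serialization time; plain arrays pass through unchanged.
def resolve_collection(collection)
  return unless collection

  collection.respond_to?(:call) ? collection.call : collection
end
```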
data/lib/data_porter/rejects_csv_builder.rb
ADDED
@@ -0,0 +1,35 @@
+# frozen_string_literal: true
+
+require "csv"
+
+module DataPorter
+  class RejectsCsvBuilder
+    def initialize(columns, records)
+      @columns = columns
+      @records = records
+    end
+
+    def generate
+      CSV.generate do |csv|
+        csv << header_row
+        rejected_records.each { |r| csv << record_row(r) }
+      end
+    end
+
+    private
+
+    def header_row
+      ["line"] + @columns.map { |c| c.name.to_s } + ["errors"]
+    end
+
+    def rejected_records
+      @records.reject(&:complete?)
+    end
+
+    def record_row(record)
+      values = @columns.map { |c| record.data[c.name.to_s] }
+      errors = record.errors_list.map(&:message).join("; ")
+      [record.line_number] + values + [errors]
+    end
+  end
+end
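The rejects CSV layout (line number, one column per target field, then joined error messages) can be exercised with toy records; `RejectedRow` and `rejects_csv` are stand-ins invented here for the engine's record objects and builder:

```ruby
require "csv"

# Toy stand-in for the engine's import records: complete? marks importable
# rows, everything else lands in the rejects CSV with its error messages.
RejectedRow = Struct.new(:line_number, :data, :errors, keyword_init: true) do
  def complete?
    errors.empty?
  end
end

def rejects_csv(column_names, records)
  CSV.generate do |csv|
    csv << ["line"] + column_names + ["errors"]
    records.reject(&:complete?).each do |r|
      csv << [r.line_number] + column_names.map { |c| r.data[c] } + [r.errors.join("; ")]
    end
  end
end
```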
data/lib/data_porter/sources/base.rb
CHANGED
@@ -45,6 +45,12 @@ module DataPorter
       def auto_map(row)
         row.to_h.transform_keys { |k| k.parameterize(separator: "_").to_sym }
       end
+
+      def fallback_headers(raw_headers)
+        return raw_headers if raw_headers.any?(&:present?)
+
+        raw_headers.each_with_index.map { |_, i| "col_#{i + 1}" }
+      end
     end
   end
 end
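`fallback_headers` only kicks in when every header cell is blank. A plain-Ruby sketch (`present?` is ActiveSupport, approximated here with a strip/empty? check):

```ruby
# If no header cell has content, generate positional col_1..col_n names;
# otherwise return the headers untouched.
def fallback_headers(raw_headers)
  return raw_headers if raw_headers.any? { |h| !h.to_s.strip.empty? }

  raw_headers.each_index.map { |i| "col_#{i + 1}" }
end
```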
data/lib/data_porter/sources/csv.rb
CHANGED
@@ -5,6 +5,8 @@ require "csv"
 module DataPorter
   module Sources
     class Csv < Base
+      SEPARATORS = [",", ";", "\t"].freeze
+
       def initialize(data_import, content: nil)
         super(data_import)
         @content = content
@@ -12,7 +14,8 @@ module DataPorter
 
       def headers
         first_line = csv_content.lines.first
-        ::CSV.parse_line(first_line, **extra_options).map(&:to_s)
+        raw = ::CSV.parse_line(first_line, **extra_options).map(&:to_s)
+        fallback_headers(raw)
       end
 
       def fetch
@@ -26,11 +29,28 @@ module DataPorter
       private
 
       def csv_content
-        @content || download_file
+        @csv_content ||= ensure_utf8(@content || download_file)
       end
 
       def download_file
-        @data_import.file.download
+        @data_import.file.download
+      end
+
+      def ensure_utf8(raw)
+        raw = strip_bom(raw)
+        return raw if raw.encoding == Encoding::UTF_8 && raw.valid_encoding?
+
+        raw.force_encoding("UTF-8")
+        return raw if raw.valid_encoding?
+
+        raw.encode("UTF-8", "ISO-8859-1")
+      end
+
+      def strip_bom(raw)
+        bytes = raw.b
+        return raw unless bytes.start_with?("\xEF\xBB\xBF".b)
+
+        bytes[3..].force_encoding("UTF-8")
       end
 
       def csv_options
@@ -39,9 +59,16 @@ module DataPorter
 
       def extra_options
         config = @data_import.config
-        return {} unless config.is_a?(Hash)
+        return { col_sep: detect_separator } unless config.is_a?(Hash)
+
+        opts = config.symbolize_keys.slice(:col_sep, :encoding)
+        opts[:col_sep] ||= detect_separator
+        opts
+      end
 
-
+      def detect_separator
+        first_line = csv_content.lines.first.to_s
+        SEPARATORS.max_by { |sep| first_line.count(sep) }
       end
     end
   end
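The delimiter auto-detection described in the changelog is a frequency count over the first line: the candidate separator that occurs most often wins, and an explicit `col_sep` config bypasses it entirely. A standalone sketch:

```ruby
# Frequency analysis on the first line of the file; String#count tallies
# occurrences of each candidate separator and max_by picks the winner.
SEPARATORS = [",", ";", "\t"].freeze

def detect_separator(content)
  first_line = content.lines.first.to_s
  SEPARATORS.max_by { |sep| first_line.count(sep) }
end
```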
data/lib/data_porter/version.rb
CHANGED
data/lib/data_porter.rb
CHANGED
@@ -18,6 +18,7 @@ require_relative "data_porter/sources"
 require_relative "data_porter/record_validator"
 require_relative "data_porter/broadcaster"
 require_relative "data_porter/orchestrator"
+require_relative "data_porter/rejects_csv_builder"
 require_relative "data_porter/components"
 require_relative "data_porter/engine"
 
data/lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb
CHANGED
@@ -10,7 +10,7 @@ class CreateDataPorterImports < ActiveRecord::Migration[<%= ActiveRecord::Migrat
       t.jsonb :report, null: false, default: {}
       t.jsonb :config, null: false, default: {}
 
-      t.references :user, polymorphic: true, null: false
+      t.references :user, polymorphic: true
 
       t.timestamps
     end
data/lib/generators/data_porter/install/templates/initializer.rb
CHANGED
@@ -15,11 +15,9 @@ DataPorter.configure do |config|
   # config.cable_channel_prefix = "data_porter"
 
   # Context builder: inject business data into targets.
-  # Receives the
-  # config.context_builder = ->(
-  #
-  # user: controller.current_user
-  # )
+  # Receives the DataImport record.
+  # config.context_builder = ->(data_import) {
+  #   { user: data_import.user }
   # }
 
   # Maximum number of records displayed in preview.
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: data_porter
 version: !ruby/object:Gem::Version
-  version: 0.9.0
+  version: 1.0.2
 platform: ruby
 authors:
 - Seryl Lounis
@@ -139,16 +139,6 @@ files:
 - app/views/data_porter/mapping_templates/new.html.erb
 - app/views/layouts/data_porter/application.html.erb
 - config/routes.rb
-- docs/CONFIGURATION.md
-- docs/MAPPING.md
-- docs/ROADMAP.md
-- docs/SOURCES.md
-- docs/TARGETS.md
-- docs/screenshots/index-with-previewing.jpg
-- docs/screenshots/index.jpg
-- docs/screenshots/mapping.jpg
-- docs/screenshots/modal-new-import.jpg
-- docs/screenshots/preview.jpg
 - lib/data_porter.rb
 - lib/data_porter/broadcaster.rb
 - lib/data_porter/components.rb
@@ -174,6 +164,7 @@ files:
 - lib/data_porter/orchestrator/record_builder.rb
 - lib/data_porter/record_validator.rb
 - lib/data_porter/registry.rb
+- lib/data_porter/rejects_csv_builder.rb
 - lib/data_porter/sources.rb
 - lib/data_porter/sources/api.rb
 - lib/data_porter/sources/base.rb
@@ -198,8 +189,11 @@ homepage: https://github.com/SerylLns/data_porter
 licenses:
 - MIT
 metadata:
+  homepage_uri: https://github.com/SerylLns/data_porter
   source_code_uri: https://github.com/SerylLns/data_porter
   changelog_uri: https://github.com/SerylLns/data_porter/blob/main/CHANGELOG.md
+  documentation_uri: https://github.com/SerylLns/data_porter#readme
+  bug_tracker_uri: https://github.com/SerylLns/data_porter/issues
   rubygems_mcp_server_uri: https://rubygems.org/gems/data_porter
   rubygems_mfa_required: 'true'
 rdoc_options: []
data/docs/CONFIGURATION.md
DELETED
|
@@ -1,103 +0,0 @@
|
|
|
1
|
-
# Configuration
|
|
2
|
-
|
|
3
|
-
All options are set in `config/initializers/data_porter.rb`:
|
|
4
|
-
|
|
5
|
-
```ruby
|
|
6
|
-
DataPorter.configure do |config|
|
|
7
|
-
# Parent controller for the engine's controllers to inherit from.
|
|
8
|
-
# Controls authentication, layouts, and helpers.
|
|
9
|
-
config.parent_controller = "ApplicationController"
|
|
10
|
-
|
|
11
|
-
# ActiveJob queue name for import jobs.
|
|
12
|
-
config.queue_name = :imports
|
|
13
|
-
|
|
14
|
-
# ActiveStorage service for uploaded files.
|
|
15
|
-
config.storage_service = :local
|
|
16
|
-
|
|
17
|
-
# ActionCable channel prefix.
|
|
18
|
-
config.cable_channel_prefix = "data_porter"
|
|
19
|
-
|
|
20
|
-
# Context builder: inject business data into targets.
|
|
21
|
-
# Receives the current controller instance.
|
|
22
|
-
config.context_builder = ->(controller) {
|
|
23
|
-
OpenStruct.new(user: controller.current_user)
|
|
24
|
-
}
|
|
25
|
-
|
|
26
|
-
# Maximum number of records displayed in preview.
|
|
27
|
-
config.preview_limit = 500
|
|
28
|
-
|
|
29
|
-
# Enabled source types.
|
|
30
|
-
config.enabled_sources = %i[csv json api xlsx]
|
|
31
|
-
|
|
32
|
-
# Auto-purge completed/failed imports older than this duration.
|
|
33
|
-
# Set to nil to disable. Run `rake data_porter:purge` manually or via cron.
|
|
34
|
-
config.purge_after = 60.days
|
|
35
|
-
end
|
|
36
|
-
```
|
|
37
|
-
|
|
38
|
-
## Options reference
|
|
39
|
-
|
|
40
|
-
| Option | Default | Description |
|
|
41
|
-
|---|---|---|
|
|
42
|
-
| `parent_controller` | `"ApplicationController"` | Controller class the engine inherits from |
|
|
43
|
-
| `queue_name` | `:imports` | ActiveJob queue for import jobs |
|
|
44
|
-
| `storage_service` | `:local` | ActiveStorage service name |
|
|
45
|
-
| `cable_channel_prefix` | `"data_porter"` | ActionCable stream prefix |
|
|
46
|
-
| `context_builder` | `nil` | Lambda receiving the controller, returns context passed to target methods |
|
|
47
|
-
| `preview_limit` | `500` | Max records shown in the preview step |
|
|
48
|
-
| `enabled_sources` | `%i[csv json api xlsx]` | Source types available in the UI |
|
|
49
|
-
| `purge_after` | `60.days` | Auto-purge completed/failed imports older than this duration |
|
|
50
|
-
|
|
51
|
-
## Authentication
|
|
52
|
-
|
|
53
|
-
The engine inherits authentication from `parent_controller`. Set it to your authenticated base controller:
|
|
54
|
-
|
|
55
|
-
```ruby
|
|
56
|
-
config.parent_controller = "Admin::BaseController"
|
|
57
|
-
```
|
|
58
|
-
|
|
59
|
-
All engine routes will require the same authentication as your base controller.
|
|
60
|
-
|
|
61
|
-
## Context builder

The `context_builder` lambda lets you inject business data (current user, tenant, permissions) into target methods (`persist`, `after_import`, `on_error`):

```ruby
config.context_builder = ->(controller) {
  OpenStruct.new(
    user: controller.current_user,
    organization: controller.current_organization
  )
}
```

The returned object is available as `context` in all target instance methods.
## Real-time progress

DataPorter tracks import progress via JSON polling. The Stimulus progress controller polls `GET /imports/:id/status` every second and updates an animated progress bar.

The status endpoint returns:

```json
{
  "status": "importing",
  "progress": { "current": 42, "total": 100, "percentage": 42 }
}
```

No ActionCable or WebSocket configuration is required -- polling works out of the box with any deployment.
## Auto-purge

Old completed/failed imports can be cleaned up automatically:

```bash
# Run manually
bin/rails data_porter:purge

# Or schedule via cron (e.g. with whenever or solid_queue)
# Removes imports older than purge_after (default: 60 days)
```

Attached files are purged from ActiveStorage along with the import record.
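The purge rule itself is simple; a minimal pure-Ruby sketch of the selection logic (the real rake task queries the imports table -- the `Import` struct and `purgeable` helper here are illustrative stand-ins):

```ruby
# Hypothetical in-memory stand-in for import records; the actual task
# operates on DataPorter's imports table via ActiveRecord.
Import = Struct.new(:status, :created_at)

# Select imports eligible for purging: completed or failed,
# and created before the purge_after cutoff.
def purgeable(imports, purge_after_seconds, now: Time.now)
  cutoff = now - purge_after_seconds
  imports.select do |imp|
    %w[completed failed].include?(imp.status) && imp.created_at < cutoff
  end
end
```

Pending or in-progress imports are never purged, regardless of age.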
data/docs/MAPPING.md
DELETED
@@ -1,44 +0,0 @@
# Column Mapping

For file-based sources (CSV/XLSX), DataPorter adds an interactive mapping step between upload and parsing. Users see their file's actual column headers and map each one to a target field via dropdowns.

```
File Header          Target Field
+-----------+        +---------------+
| Prenom    |   ->   | First Name  v |
+-----------+        +---------------+
+-----------+        +---------------+
| Nom       |   ->   | Last Name   v |
+-----------+        +---------------+
```

Dropdowns are pre-filled from the Target's `csv_mapping` when headers match. Users can adjust any mapping before continuing to the preview step.

## Required fields

Required target fields are marked with `*` in the dropdown labels. If any required field is left unmapped, a warning banner appears listing the missing fields. This validation is client-side only -- it warns but does not block submission.

## Duplicate detection

If two file headers are mapped to the same target field, the affected rows are highlighted with an orange border and a warning message appears. This helps catch accidental duplicate mappings before parsing.
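The duplicate check boils down to counting how many headers point at each target field; a minimal sketch in plain Ruby (the helper name is illustrative, not the engine's API):

```ruby
# Given a header => target-field mapping, return the target fields
# that more than one header is mapped to.
def duplicate_targets(column_mapping)
  column_mapping.values.compact.tally.select { |_field, count| count > 1 }.keys
end
```

Any field returned here would trigger the orange-border warning in the UI.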
## Mapping Templates

Mappings can be saved as reusable templates. When starting a new import, users select a saved template from a dropdown to auto-fill all column mappings at once. Templates are stored per target, so each import type has its own template library.

### Managing templates

- **Inline**: check "Save as template" in the mapping form and give it a name
- **CRUD**: use the "Mapping Templates" link on the imports index page to create, edit, and delete templates

When a template is loaded, the "Save as template" checkbox is hidden, since the user is already working from an existing template.
## Mapping Priority

When parsing, mappings are resolved in priority order:

1. **User mapping** -- from the mapping UI (`config["column_mapping"]`)
2. **Code mapping** -- from the Target DSL (`csv_mapping`)
3. **Auto-map** -- parameterize headers to match column names

Non-file sources (JSON, API) skip the mapping step entirely.
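The fall-through above can be sketched in plain Ruby (the `resolve_mapping` helper is illustrative, and the last step approximates Rails' `parameterize` with plain regexes):

```ruby
# Resolve the mapping for one file header, in priority order:
# 1. user mapping from the UI, 2. code mapping from the DSL,
# 3. auto-map by parameterizing the header into a column name.
def resolve_mapping(header, user_mapping, code_mapping)
  user_mapping[header] ||
    code_mapping[header] ||
    header.strip.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "").to_sym
end
```

So a header like `"First Name"` auto-maps to `:first_name` when neither a user nor a code mapping claims it.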
data/docs/ROADMAP.md
DELETED
@@ -1,28 +0,0 @@
# Roadmap

## v1.0 — Production-ready

The goal is a gem that handles real-world imports reliably at scale.

### ~~1. Records pagination~~ DONE

Implemented in v0.6.0. Preview and completed pages are paginated (50 per page). The controller limits the records loaded via the `RecordPagination` concern.

### ~~2. Import params~~ DONE

Implemented in v0.9.0. Targets declare `params` with a DSL (`:select`, `:text`, `:number`, `:hidden`). Values are stored in `config["import_params"]`, accessible via `import_params` in all target instance methods. See [Targets docs](TARGETS.md#params--).

---

## v2+ (future)

- Scoped imports (filter index by user/tenant)
- Webhooks / callbacks on import completion
- Batch persist (`insert_all` support)
- Resume / partial retry
- Scheduled imports (recurring API source)
- i18n
- Dashboard stats
data/docs/SOURCES.md
DELETED
@@ -1,94 +0,0 @@
# Sources

DataPorter supports four source types. Each source reads data from a different format and feeds it through the same parsing pipeline.

## CSV

Upload a CSV file. Headers are extracted automatically and presented in the [column mapping](MAPPING.md) step. Configure header mappings with `csv_mapping` in your [Target](TARGETS.md) when file headers don't match your column names.

Custom separator:

```ruby
import.config = { "separator" => ";" }
```
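A custom separator behaves like the `col_sep` option of Ruby's stdlib CSV; a self-contained illustration of what a semicolon-separated file parses into:

```ruby
require "csv"

# A semicolon-separated file, as produced by many European spreadsheets.
data = "name;age\nAda;36\nGrace;45\n"

# Parse with headers and a custom column separator.
rows  = CSV.parse(data, headers: true, col_sep: ";")
names = rows.map { |row| row["name"] }  # => ["Ada", "Grace"]
```

Note that all parsed values come back as strings; type coercion happens later, per the column `type` declared in the Target.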
## XLSX

Upload an Excel `.xlsx` file. Uses the same `csv_mapping` for header-to-column mapping as CSV. By default the first sheet is parsed; select a different sheet via config:

```ruby
import.config = { "sheet_index" => 1 }
```

Powered by [creek](https://github.com/pythonicrubyist/creek) for streaming, memory-efficient parsing.
## JSON

Upload a JSON file. Use `json_root` in your Target to specify the path to the records array. Raw JSON arrays are supported without `json_root`.

```ruby
json_root "data.users"
```

Given `{ "data": { "users": [...] } }`, records are extracted from `data.users`.
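Resolving a dot-separated root path is a simple walk over the parsed document; a sketch in plain Ruby (the `extract_records` helper is illustrative, not the engine's internal API):

```ruby
require "json"

# Walk a dot-separated json_root path down a parsed JSON document.
# A nil root means the document itself is the records array.
def extract_records(document, json_root)
  return document unless json_root
  json_root.split(".").reduce(document) { |node, key| node[key] }
end

doc     = JSON.parse('{"data": {"users": [{"email": "a@example.com"}]}}')
records = extract_records(doc, "data.users")
```
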
## API

Fetch records from an external API endpoint. No file upload is needed -- the engine calls the API directly.

### Basic usage

```ruby
api_config do
  endpoint "https://api.example.com/data"
  headers({ "Authorization" => "Bearer token" })
  response_root "results"
end
```

| Option | Type | Description |
|---|---|---|
| `endpoint` | String or Proc | URL to fetch records from |
| `headers` | Hash or Proc | HTTP headers sent with the request |
| `response_root` | String | Key in the JSON response containing the records array (omit for top-level arrays) |

### Dynamic endpoints and headers

Both `endpoint` and `headers` accept lambdas for runtime values. The endpoint lambda receives the import's `config` hash:

```ruby
api_config do
  endpoint ->(params) { "https://api.example.com/events?page=#{params[:page]}" }
  headers -> { { "Authorization" => "Bearer #{ENV['API_TOKEN']}" } }
  response_root "data"
end
```

### Full example

```ruby
class EventTarget < DataPorter::Target
  label "Events"
  model_name "Event"
  sources :api

  api_config do
    endpoint "https://api.example.com/events"
    headers -> { { "Authorization" => "Bearer #{ENV['EVENTS_API_KEY']}" } }
    response_root "events"
  end

  columns do
    column :name, type: :string, required: true
    column :date, type: :date
    column :venue, type: :string
    column :capacity, type: :integer
  end

  def persist(record, context:)
    Event.create!(record.attributes)
  end
end
```

When a user creates an import with source type **API**, the engine skips file upload entirely, calls the configured endpoint, parses the JSON response, and feeds the records through the same preview/validate/import pipeline as file-based sources.
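How Proc-valued `endpoint` and `headers` resolve at fetch time can be shown in isolation (no HTTP is performed here; the `API_TOKEN` variable and fallback value are illustrative):

```ruby
# A Proc endpoint receives the import's config hash at fetch time;
# a Proc headers block is called with no arguments.
endpoint = ->(params) { "https://api.example.com/events?page=#{params[:page]}" }
headers  = -> { { "Authorization" => "Bearer #{ENV.fetch('API_TOKEN', 'test')}" } }

url  = endpoint.call(page: 2)
auth = headers.call
```

Because both are evaluated per fetch, rotated tokens or per-import query parameters are picked up without reloading the app.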
data/docs/TARGETS.md
DELETED
@@ -1,227 +0,0 @@
# Targets

Targets are plain Ruby classes in `app/importers/` that inherit from `DataPorter::Target`. Each target defines one import type: its columns, sources, mappings, and persistence logic.

## Generator

```bash
bin/rails generate data_porter:target ModelName column:type[:required] ... [--sources csv xlsx]
```

Examples:

```bash
bin/rails generate data_porter:target User email:string:required name:string age:integer --sources csv xlsx
bin/rails generate data_porter:target Product name:string price:decimal --sources csv
bin/rails generate data_porter:target Order order_number:string total:decimal
```

Column format: `name:type[:required]`

Supported types: `string`, `integer`, `decimal`, `boolean`, `date`.

The `--sources` option specifies which source types the target accepts (default: `csv`). The UI only shows these sources when the target is selected.
## Class-level DSL

```ruby
class OrderTarget < DataPorter::Target
  label "Orders"
  model_name "Order"
  icon "fas fa-shopping-cart"
  sources :csv, :json, :api, :xlsx

  columns do
    column :order_number, type: :string, required: true
    column :total, type: :decimal
    column :placed_at, type: :date
    column :active, type: :boolean
    column :quantity, type: :integer
  end

  csv_mapping do
    map "Order #" => :order_number
    map "Total ($)" => :total
  end

  json_root "data.orders"

  api_config do
    endpoint "https://api.example.com/orders"
    headers({ "Authorization" => "Bearer token" })
    response_root "data.orders"
  end

  deduplicate_by :order_number

  dry_run_enabled

  params do
    param :warehouse_id, type: :select, label: "Warehouse", required: true,
          collection: -> { Warehouse.pluck(:name, :id) }
    param :currency, type: :text, default: "USD"
  end
end
```

### `label(value)`

Human-readable name shown in the UI.

### `model_name(value)`

The ActiveRecord model name this target imports into (for display purposes).

### `icon(value)`

CSS icon class (e.g. FontAwesome) shown in the UI.

### `sources(*types)`

Accepted source types: `:csv`, `:json`, `:api`, `:xlsx`.

### `columns { ... }`

Defines the expected columns for this import. Each column accepts:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | Symbol | (required) | Column identifier |
| `type` | Symbol | `:string` | One of `:string`, `:integer`, `:decimal`, `:boolean`, `:date` |
| `required` | Boolean | `false` | Whether the column must have a value |
| `label` | String | Humanized name | Display label in the preview |
### `csv_mapping { ... }`

Maps CSV/XLSX header names to column names when they don't match:

```ruby
csv_mapping do
  map "First Name" => :first_name
  map "E-mail" => :email
end
```
### `json_root(path)`

Dot-separated path to the array of records within a JSON document:

```ruby
json_root "data.users"
```

Given `{ "data": { "users": [...] } }`, records are extracted from `data.users`.
### `api_config { ... }`

See [Sources: API](SOURCES.md#api) for full documentation.

### `deduplicate_by(*keys)`

Skip records that share the same value(s) for the given column(s):

```ruby
deduplicate_by :email
deduplicate_by :first_name, :last_name
```
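The semantics -- keep the first record for each key combination, skip later duplicates -- can be sketched over plain hashes (the `deduplicate` helper is illustrative, not the engine's API):

```ruby
# Keep the first record seen for each composite deduplication key;
# later records with the same key values are skipped.
def deduplicate(records, keys)
  records.uniq { |record| keys.map { |k| record[k] } }
end
```

With `deduplicate_by :first_name, :last_name`, two records only collide when *both* values match.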
### `dry_run_enabled`

Enables dry-run mode for this target. A "Dry Run" button appears in the preview step. A dry run executes the full import pipeline (transform, validate, persist) inside a rolled-back transaction, producing a validation report without modifying the database.

### `params { ... }`

Declares extra form fields shown when this target is selected in the import form. Values are stored in `config["import_params"]` and accessible via `import_params` in all instance methods.

```ruby
params do
  param :hotel_id, type: :select, label: "Hotel", required: true,
        collection: -> { Hotel.pluck(:name, :id) }
  param :currency, type: :text, label: "Currency", default: "EUR"
  param :batch_size, type: :number, label: "Batch Size", default: "100"
  param :tenant_id, type: :hidden, default: "abc123"
end
```

Each param accepts:

| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | Symbol | (required) | Param identifier |
| `type` | Symbol | `:text` | One of `:select`, `:text`, `:number`, `:hidden` |
| `required` | Boolean | `false` | Validated on import creation, shown with `*` in the form |
| `label` | String | Humanized name | Display label in the form |
| `default` | String | `nil` | Pre-filled value in the form |
| `collection` | Lambda | `nil` | For `:select` type -- returns `[[label, value], ...]` |

Collection lambdas are evaluated when the form loads, not at boot time. This ensures fresh data (e.g. newly created hotels appear immediately).
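The difference between boot-time and form-load evaluation is easy to demonstrate in isolation (the in-memory `hotels` array stands in for a database table):

```ruby
# In-memory stand-in for Hotel.pluck(:name, :id).
hotels = [["Palace", 1]]

# Because the collection is a lambda, it is re-evaluated on each call
# (i.e. each form render), not frozen at boot.
collection = -> { hotels.map(&:dup) }

before = collection.call
hotels << ["Grand", 2]       # a new hotel is created after "boot"
after  = collection.call     # the new option appears without a restart
```
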
## Instance Methods

### `import_params`

Returns a hash of the import param values set by the user in the form. Available in all instance methods (`persist`, `transform`, `validate`, `after_import`, `on_error`). Defaults to `{}` when no params are declared.

```ruby
def persist(record, context:)
  Guest.create!(
    record.attributes.merge(
      hotel_id: import_params["hotel_id"],
      currency: import_params["currency"]
    )
  )
end
```

Override the following methods in your target to customize behavior.
### `transform(record)`

Transforms a record before validation. Must return the (modified) record.

```ruby
def transform(record)
  record.attributes["email"] = record.attributes["email"]&.downcase
  record
end
```
### `validate(record)`

Adds custom validation errors to a record:

```ruby
def validate(record)
  record.add_error("Email is invalid") unless record.attributes["email"]&.include?("@")
end
```

### `persist(record, context:)`

**Required.** Saves the record to your database. Raises `NotImplementedError` if not overridden.

```ruby
def persist(record, context:)
  User.create!(record.attributes)
end
```

### `after_import(results, context:)`

Called once after all records have been processed:

```ruby
def after_import(results, context:)
  AdminMailer.import_complete(context.user, results).deliver_later
end
```

### `on_error(record, error, context:)`

Called when a record fails to import:

```ruby
def on_error(record, error, context:)
  Sentry.capture_exception(error, extra: { record: record.attributes })
end
```
data/docs/screenshots/index.jpg
DELETED
(binary screenshot files removed; no diff shown)