data_porter 0.2.0 → 0.4.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +49 -0
- data/README.md +60 -393
- data/ROADMAP.md +30 -12
- data/app/assets/javascripts/data_porter/stimulus.min.js +2 -0
- data/app/assets/javascripts/data_porter/turbo.min.js +29 -0
- data/app/assets/stylesheets/data_porter/alerts.css +25 -0
- data/app/assets/stylesheets/data_porter/application.css +12 -646
- data/app/assets/stylesheets/data_porter/badges.css +73 -0
- data/app/assets/stylesheets/data_porter/base.css +56 -0
- data/app/assets/stylesheets/data_porter/cards.css +60 -0
- data/app/assets/stylesheets/data_porter/layout.css +128 -0
- data/app/assets/stylesheets/data_porter/mapping.css +79 -0
- data/app/assets/stylesheets/data_porter/modal.css +49 -0
- data/app/assets/stylesheets/data_porter/preview.css +24 -0
- data/app/assets/stylesheets/data_porter/progress.css +37 -0
- data/app/assets/stylesheets/data_porter/table.css +45 -0
- data/app/controllers/data_porter/imports_controller.rb +74 -10
- data/app/controllers/data_porter/mapping_templates_controller.rb +85 -0
- data/app/javascript/data_porter/mapping_controller.js +86 -0
- data/app/javascript/data_porter/progress_controller.js +1 -1
- data/app/javascript/data_porter/template_form_controller.js +46 -0
- data/app/jobs/data_porter/extract_headers_job.rb +12 -0
- data/app/models/data_porter/data_import.rb +7 -1
- data/app/models/data_porter/mapping_template.rb +15 -0
- data/app/views/data_porter/imports/index.html.erb +8 -7
- data/app/views/data_porter/imports/new.html.erb +9 -3
- data/app/views/data_porter/imports/show.html.erb +41 -13
- data/app/views/data_porter/mapping_templates/_form.html.erb +40 -0
- data/app/views/data_porter/mapping_templates/edit.html.erb +11 -0
- data/app/views/data_porter/mapping_templates/index.html.erb +42 -0
- data/app/views/data_porter/mapping_templates/new.html.erb +11 -0
- data/app/views/layouts/data_porter/application.html.erb +162 -0
- data/config/routes.rb +3 -0
- data/docs/CONFIGURATION.md +81 -0
- data/docs/MAPPING.md +44 -0
- data/docs/SOURCES.md +94 -0
- data/docs/TARGETS.md +176 -0
- data/docs/screenshots/mapping.jpg +0 -0
- data/lib/data_porter/components/mapping/column_row.rb +52 -0
- data/lib/data_porter/components/mapping/form.rb +127 -0
- data/lib/data_porter/components/mapping/template_select.rb +35 -0
- data/lib/data_porter/components/preview/results_summary.rb +21 -0
- data/lib/data_porter/components/preview/summary_cards.rb +32 -0
- data/lib/data_porter/components/preview/table.rb +56 -0
- data/lib/data_porter/components/progress/bar.rb +35 -0
- data/lib/data_porter/components/shared/failure_alert.rb +22 -0
- data/lib/data_porter/components/shared/status_badge.rb +18 -0
- data/lib/data_porter/components.rb +9 -6
- data/lib/data_porter/engine.rb +7 -1
- data/lib/data_porter/orchestrator.rb +21 -1
- data/lib/data_porter/sources/base.rb +18 -3
- data/lib/data_porter/sources/csv.rb +5 -0
- data/lib/data_porter/sources/xlsx.rb +8 -0
- data/lib/data_porter/version.rb +1 -1
- data/lib/generators/data_porter/install/install_generator.rb +4 -0
- data/lib/generators/data_porter/install/templates/create_data_porter_mapping_templates.rb.erb +16 -0
- data/lib/generators/data_porter/install/templates/initializer.rb +1 -1
- metadata +61 -39
- data/lib/data_porter/components/failure_alert.rb +0 -20
- data/lib/data_porter/components/preview_table.rb +0 -54
- data/lib/data_porter/components/progress_bar.rb +0 -33
- data/lib/data_porter/components/results_summary.rb +0 -19
- data/lib/data_porter/components/status_badge.rb +0 -16
- data/lib/data_porter/components/summary_cards.rb +0 -30
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: cf5cdd3a072250ed1b6f8066766f4bb94c0447c80f993de6031f562ef29c969f
+  data.tar.gz: c098c9c856c42c7f0cbdb2f0e4efef8fa298f21bac79ecc8abeb8d0c9435160a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: '08d4f1bc3867112f83a6637129d6e34373e5f5b29913e116cc822fdd150447eb047cac185bb2be5c05180340672ded8c712b8f2e0ec14ca28f71c3c653381898'
+  data.tar.gz: a8808881059e7a84fa37042a52999462c27fca0da39bfdc3828c11e4eab88d095617394443baf6974c49e6f9bb0404dd861b561d6f5f6e3c40aafe5018d0633c
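The `checksums.yaml` entries are SHA-256 and SHA-512 hex digests of the gem's `metadata.gz` and `data.tar.gz`. As a minimal sketch of how such a digest could be compared using Ruby's standard `digest` library: the `digest_matches?` helper and the `"abc"` payload below are hypothetical stand-ins for illustration, not part of data_porter or RubyGems.

```ruby
require "digest"

# Hypothetical helper: true when the hex digest of `bytes` equals `expected_hex`.
# `algorithm` is any Digest class, e.g. Digest::SHA256 or Digest::SHA512.
def digest_matches?(bytes, expected_hex, algorithm: Digest::SHA256)
  algorithm.hexdigest(bytes) == expected_hex.downcase
end

payload  = "abc" # stand-in for the bytes of data.tar.gz
expected = Digest::SHA256.hexdigest(payload)

puts digest_matches?(payload, expected)        # digest of the same bytes
puts digest_matches?(payload + "x", expected)  # digest of altered bytes
```

For a file on disk, `Digest::SHA256.file(path).hexdigest` computes the digest without loading the whole file into a string first.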
data/CHANGELOG.md
CHANGED
@@ -5,6 +5,55 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.4.0] - 2026-02-07
+
+### Added
+
+- **Standalone engine layout** -- Self-contained HTML layout with importmap, loading Stimulus and Turbo from CDN. The engine no longer depends on the host app's layout or asset pipeline
+- **Turbo Drive** -- All navigation within the engine is now handled by Turbo Drive for instant page transitions
+- **Required field indication** -- Mapping form marks required target fields with `*` and shows a warning listing unmapped required fields
+- **Duplicate mapping detection** -- Visual warning (orange border + message) when two file headers are mapped to the same target field
+- **File validation on create** -- Controller-level validation rejects CSV/JSON/XLSX imports without a file attached, with error message displayed on the form
+- **Hide save-as-template** -- The save-as-template checkbox is hidden when a mapping template is loaded
+- **Import details card** -- Show page displays target, source type, file name, date, and record count
+
+### Changed
+
+- Mapping templates form rewritten with `<select>` elements for target fields instead of plain text inputs
+- Templates index buttons styled as proper `dp-btn--secondary` / `dp-btn--danger`
+- "Back to imports" moved to header as a button across all pages
+- Progress controller uses `Turbo.visit` instead of `window.location.reload`
+- ActionCable loaded via dynamic import with polling fallback (avoids CDN MIME type issues)
+- Controllers return `422 Unprocessable Entity` on form validation errors for Turbo compatibility
+- Rubocop limits relaxed: ClassLength 150, MethodLength 15
+- 280 RSpec examples (up from 265), 0 failures
+
+## [0.3.0] - 2026-02-07
+
+### Added
+
+- **Interactive column mapping** -- File-based imports (CSV/XLSX) now pause on a mapping step where users match file headers to target fields via dropdowns
+- **Header extraction** -- New `ExtractHeadersJob` reads the first row of a file without parsing all data, with `extracting_headers` and `mapping` statuses
+- **Dynamic mapping priority** -- User mapping (from UI) > code mapping (from Target DSL) > auto-map (parameterized headers)
+- **`#headers` method** on `Sources::Csv` and `Sources::Xlsx` for lightweight first-row extraction
+- **`#file_based?` helper** on `DataImport` to distinguish file sources from structured sources
+- **MappingTemplate model** -- Persist reusable column mappings per target (`data_porter_mapping_templates` table)
+- **MappingTemplatesController** -- Full CRUD for managing saved mapping templates
+- **Mapping Phlex components** -- `Mapping::Form`, `Mapping::ColumnRow`, `Mapping::TemplateSelect` for the mapping UI
+- **Stimulus mapping controller** -- Client-side template loading with zero network requests (reads `data-mapping` JSON attributes)
+- **Save-as-template** -- Checkbox in the mapping form to save the current mapping for future imports
+- **Badge styles** for `extracting_headers` and `mapping` statuses
+- **Install generator** now creates `data_porter_mapping_templates` migration
+
+### Changed
+
+- CSS split from monolithic `application.css` into 10 domain-specific stylesheets (base, layout, table, badges, cards, preview, progress, alerts, modal, mapping)
+- Phlex components reorganized into subdirectories: `Shared::`, `Preview::`, `Progress::`, `Mapping::`
+- File-based imports route through `ExtractHeadersJob` instead of `ParseJob` on create
+- `Orchestrator` gains `extract_headers!` method and `build_source` / `store_headers` helpers
+- `Sources::Base#apply_csv_mapping` now checks three mapping sources in priority order
+- 265 RSpec examples (up from 225), 0 failures
+
 ## [0.2.0] - 2026-02-07
 
 ### Added
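The 0.3.0 entry describes a three-level mapping priority: user mapping, then code mapping, then auto-map from parameterized headers. The sketch below shows one way that lookup order could work in plain Ruby. It does not reproduce `Sources::Base#apply_csv_mapping`; `resolve_column` and `auto_key` are hypothetical names, and `auto_key` is only a rough stand-in for ActiveSupport's `parameterize`.

```ruby
# Hypothetical stand-in for parameterizing a header: lowercase it and
# collapse every run of non-alphanumeric characters into one underscore.
def auto_key(header)
  header.downcase.gsub(/[^a-z0-9]+/, "_").gsub(/\A_+|_+\z/, "")
end

# Resolves a file header to a target field name, checking the three
# sources in priority order: user mapping, code mapping, auto-map.
def resolve_column(header, user_mapping: {}, code_mapping: {})
  user_mapping[header] || code_mapping[header] || auto_key(header)
end

user_mapping = { "E-mail" => "email" }                # chosen in the mapping UI
code_mapping = { "E-mail"  => "contact_email",        # declared in the target
                 "Order #" => "order_number" }

["E-mail", "Order #", "First Name"].each do |header|
  field = resolve_column(header, user_mapping: user_mapping, code_mapping: code_mapping)
  puts "#{header} -> #{field}"
end
```

Here `"E-mail"` resolves through the user mapping even though the code mapping also covers it, `"Order #"` falls back to the code mapping, and `"First Name"` is auto-mapped.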
data/README.md
CHANGED
@@ -1,48 +1,54 @@
 # DataPorter
 
-
+> [!WARNING]
+> This gem is under active development and not yet production-ready. APIs and features may change without notice.
 
-
-
-
+A mountable Rails engine for data import workflows: **Upload**, **Map**, **Preview**, **Import**.
 
-
+Supports CSV, JSON, XLSX, and API sources with a declarative DSL for defining import targets. Business-agnostic by design -- all domain logic lives in your host app.
 
-
+<table>
+<tr>
+<td><img src="docs/screenshots/index-with-previewing.jpg" width="400" alt="Import list with status badges" /></td>
+<td><img src="docs/screenshots/modal-new-import.jpg" width="400" alt="New import modal with dropzone" /></td>
+</tr>
+<tr>
+<td><img src="docs/screenshots/mapping.jpg" width="400" alt="Interactive column mapping with templates" /></td>
+<td><img src="docs/screenshots/preview.jpg" width="400" alt="Preview with summary cards and data table" /></td>
+</tr>
+</table>
+
+## Features
+
+- **4 source types** -- CSV, XLSX, JSON, and API with a unified parsing pipeline
+- **Interactive column mapping** -- Drag-free UI to match file headers to target fields ([docs](docs/MAPPING.md))
+- **Mapping templates** -- Save and reuse column mappings across imports ([docs](docs/MAPPING.md#mapping-templates))
+- **Real-time progress** -- ActionCable updates with polling fallback
+- **Dry run mode** -- Validate against the database without persisting
+- **Standalone UI** -- Self-contained layout with Turbo Drive and Stimulus, no host app dependencies
+- **Declarative Target DSL** -- One class per import type, zero boilerplate ([docs](docs/TARGETS.md))
 
 ## Requirements
 
 - Ruby >= 3.2
 - Rails >= 7.0
-- ActionCable (for real-time progress updates)
 - ActiveStorage (for file uploads)
+- ActionCable (optional, for real-time progress)
 
 ## Installation
 
-Add the gem to your Gemfile:
-
 ```bash
 bundle add data_porter
-```
-
-Run the install generator:
-
-```bash
 bin/rails generate data_porter:install
-```
-
-This will:
-- Create the migration for `data_porter_imports`
-- Add an initializer at `config/initializers/data_porter.rb`
-- Create the `app/importers/` directory
-- Mount the engine at `/imports`
-
-Run the migration:
-
-```bash
 bin/rails db:migrate
 ```
 
+The generator creates:
+- Migrations for `data_porter_imports` and `data_porter_mapping_templates`
+- An initializer at `config/initializers/data_porter.rb`
+- The `app/importers/` directory
+- Engine mount at `/imports`
+
 ## Quick Start
 
 Generate a target:
@@ -51,11 +57,9 @@ Generate a target:
 bin/rails generate data_porter:target Product name:string:required price:integer sku:string
 ```
 
-Implement
+Implement `persist` in `app/importers/product_target.rb`:
 
 ```ruby
-# frozen_string_literal: true
-
 class ProductTarget < DataPorter::Target
   label "Product"
   model_name "Product"
@@ -63,7 +67,7 @@ class ProductTarget < DataPorter::Target
   sources :csv
 
   columns do
-    column :name, type: :string,
+    column :name, type: :string, required: true
     column :price, type: :integer
     column :sku, type: :string
   end
@@ -76,377 +80,50 @@ end
 
 Visit `/imports` and start importing.
 
-## Configuration
-
-All options are set in `config/initializers/data_porter.rb`:
-
-```ruby
-DataPorter.configure do |config|
-  # Parent controller for the engine's controllers to inherit from.
-  # Controls authentication, layouts, and helpers.
-  config.parent_controller = "ApplicationController"
-
-  # ActiveJob queue name for import jobs.
-  config.queue_name = :imports
-
-  # ActiveStorage service for uploaded files.
-  config.storage_service = :local
-
-  # ActionCable channel prefix.
-  config.cable_channel_prefix = "data_porter"
-
-  # Context builder: inject business data into targets.
-  # Receives the current controller instance.
-  config.context_builder = ->(controller) {
-    OpenStruct.new(user: controller.current_user)
-  }
-
-  # Maximum number of records displayed in preview.
-  config.preview_limit = 500
-
-  # Enabled source types.
-  config.enabled_sources = %i[csv json api xlsx]
-end
-```
-
-| Option | Default | Description |
-|---|---|---|
-| `parent_controller` | `"ApplicationController"` | Controller class the engine inherits from |
-| `queue_name` | `:imports` | ActiveJob queue for import jobs |
-| `storage_service` | `:local` | ActiveStorage service name |
-| `cable_channel_prefix` | `"data_porter"` | ActionCable stream prefix |
-| `context_builder` | `nil` | Lambda receiving the controller, returns context passed to target methods |
-| `preview_limit` | `500` | Max records shown in the preview step |
-| `enabled_sources` | `%i[csv json api xlsx]` | Source types available in the UI |
-
-## Defining Targets
-
-Targets are plain Ruby classes in `app/importers/` that inherit from `DataPorter::Target`.
-
-### Class-level DSL
-
-```ruby
-class OrderTarget < DataPorter::Target
-  label "Orders"
-  model_name "Order"
-  icon "fas fa-shopping-cart"
-  sources :csv, :json, :api, :xlsx
-
-  columns do
-    column :order_number, type: :string, required: true
-    column :total, type: :decimal
-    column :placed_at, type: :date
-    column :active, type: :boolean
-    column :quantity, type: :integer
-  end
-
-  csv_mapping do
-    map "Order #" => :order_number
-    map "Total ($)" => :total
-  end
-
-  json_root "data.orders"
-
-  api_config do
-    endpoint "https://api.example.com/orders"
-    headers({ "Authorization" => "Bearer token" })
-    response_root "data.orders"
-  end
-
-  deduplicate_by :order_number
-
-  dry_run_enabled
-
-  # ...
-end
-```
-
-#### `label(value)`
-
-Human-readable name shown in the UI.
-
-#### `model_name(value)`
-
-The ActiveRecord model name this target imports into (for display purposes).
-
-#### `icon(value)`
-
-CSS icon class (e.g. FontAwesome) shown in the UI.
-
-#### `sources(*types)`
-
-Accepted source types: `:csv`, `:json`, `:api`, `:xlsx`.
-
-#### `columns { ... }`
-
-Defines the expected columns for this import. Each column accepts:
-
-| Parameter | Type | Default | Description |
-|---|---|---|---|
-| `name` | Symbol | (required) | Column identifier |
-| `type` | Symbol | `:string` | One of `:string`, `:integer`, `:decimal`, `:boolean`, `:date` |
-| `required` | Boolean | `false` | Whether the column must have a value |
-| `label` | String | Humanized name | Display label in the preview |
-
-#### `csv_mapping { ... }`
-
-Maps CSV header names to column names when they don't match:
-
-```ruby
-csv_mapping do
-  map "First Name" => :first_name
-  map "E-mail" => :email
-end
-```
-
-#### `json_root(path)`
-
-Dot-separated path to the array of records within a JSON document:
-
-```ruby
-json_root "data.users"
-```
-
-Given `{ "data": { "users": [...] } }`, records are extracted from `data.users`.
-
-#### `api_config { ... }`
-
-Configures the API source:
-
-```ruby
-api_config do
-  endpoint "https://api.example.com/records"
-  headers({ "Authorization" => "Bearer token", "Accept" => "application/json" })
-  response_root "data.items"
-end
-```
-
-#### `deduplicate_by(*keys)`
-
-Skip records that share the same value(s) for the given column(s):
-
-```ruby
-deduplicate_by :email
-deduplicate_by :first_name, :last_name
-```
-
-#### `dry_run_enabled`
-
-Enables dry run mode for this target (see [Dry Run](#dry-run)).
-
-### Instance Methods
-
-Override these in your target to customize behavior:
-
-#### `transform(record)`
-
-Transform a record before validation. Must return the (modified) record.
-
-```ruby
-def transform(record)
-  record.attributes["email"] = record.attributes["email"]&.downcase
-  record
-end
-```
-
-#### `validate(record)`
-
-Add custom validation errors to a record:
-
-```ruby
-def validate(record)
-  record.add_error("Email is invalid") unless record.attributes["email"]&.include?("@")
-end
-```
-
-#### `persist(record, context:)`
-
-**Required.** Save the record to your database. Raises `NotImplementedError` if not overridden.
-
-```ruby
-def persist(record, context:)
-  User.create!(record.attributes)
-end
-```
-
-#### `after_import(results, context:)`
-
-Called once after all records have been processed:
-
-```ruby
-def after_import(results, context:)
-  AdminMailer.import_complete(context.user, results).deliver_later
-end
-```
-
-#### `on_error(record, error, context:)`
-
-Called when a record fails to import:
-
-```ruby
-def on_error(record, error, context:)
-  Sentry.capture_exception(error, extra: { record: record.attributes })
-end
-```
-
-## Source Types
-
-### CSV
-
-Upload a CSV file. Configure header mappings with `csv_mapping` when headers don't match your column names.
-
-### XLSX
-
-Upload an Excel `.xlsx` file. Uses the same `csv_mapping` for header-to-column mapping. By default the first sheet is parsed; select a different sheet via config:
-
-```ruby
-import.config = { "sheet_index" => 1 }
-```
-
-Powered by [creek](https://github.com/pythonicrubyist/creek) for streaming, memory-efficient parsing.
-
-### JSON
-
-Upload a JSON file. Use `json_root` to specify the path to the records array. Raw JSON arrays are supported without `json_root`.
-
-### API
-
-Fetch records from an external API endpoint. No file upload is needed -- the engine calls the API directly.
-
-#### Basic usage
-
-```ruby
-api_config do
-  endpoint "https://api.example.com/data"
-  headers({ "Authorization" => "Bearer token" })
-  response_root "results"
-end
-```
-
-| Option | Type | Description |
-|---|---|---|
-| `endpoint` | String or Proc | URL to fetch records from |
-| `headers` | Hash or Proc | HTTP headers sent with the request |
-| `response_root` | String | Key in the JSON response containing the records array (omit for top-level arrays) |
-
-#### Dynamic endpoints and headers
-
-Both `endpoint` and `headers` accept lambdas for runtime values. The endpoint lambda receives the import's `config` hash (populated from the form):
-
-```ruby
-api_config do
-  endpoint ->(params) { "https://api.example.com/events?page=#{params[:page]}" }
-  headers -> { { "Authorization" => "Bearer #{ENV['API_TOKEN']}" } }
-  response_root "data"
-end
-```
-
-#### Example: importing from a paginated API
-
-```ruby
-class EventTarget < DataPorter::Target
-  label "Events"
-  model_name "Event"
-  sources :api
-
-  api_config do
-    endpoint "https://api.example.com/events"
-    headers -> { { "Authorization" => "Bearer #{ENV['EVENTS_API_KEY']}", "Accept" => "application/json" } }
-    response_root "events"
-  end
-
-  columns do
-    column :name, type: :string, required: true
-    column :date, type: :date
-    column :venue, type: :string
-    column :capacity, type: :integer
-  end
-
-  def persist(record, context:)
-    Event.create!(record.attributes)
-  end
-end
-```
-
-When a user creates an import with source type **API**, the engine skips file upload entirely, calls the configured endpoint, parses the JSON response, and feeds the records through the same preview/validate/import pipeline as CSV and JSON sources.
-
 ## Import Workflow
 
-Each import progresses through these statuses:
-
 ```
+File-based (CSV/XLSX):
+pending -> extracting_headers -> mapping -> parsing -> previewing -> importing -> completed
+
+Non-file (JSON/API):
 pending -> parsing -> previewing -> importing -> completed
-\-> failed
-pending -> parsing -> dry_running -> previewing
 ```
 
 | Status | Description |
 |---|---|
-| `pending` |
-| `
-| `
-| `
+| `pending` | Waiting for processing |
+| `extracting_headers` | Reading file headers for column mapping |
+| `mapping` | Waiting for user to map columns |
+| `parsing` | Records being extracted |
+| `previewing` | Records ready for review |
+| `importing` | Records being persisted |
 | `completed` | All records processed |
-| `failed` |
+| `failed` | Fatal error encountered |
 | `dry_running` | Dry run validation in progress |
 
-
+## Documentation
 
-
+| Topic | Description |
+|---|---|
+| [Configuration](docs/CONFIGURATION.md) | All options, authentication, context builder, real-time updates |
+| [Targets](docs/TARGETS.md) | DSL reference, columns, hooks, generator |
+| [Sources](docs/SOURCES.md) | CSV, JSON, XLSX, API setup and examples |
+| [Column Mapping](docs/MAPPING.md) | Interactive mapping, templates, priority order |
+
+## Routes
 
 | Method | Path | Action |
 |---|---|---|
 | GET | `/imports` | List imports |
-| GET | `/imports/new` | New import form |
 | POST | `/imports` | Create import |
 | GET | `/imports/:id` | Show import |
-|
-| POST | `/imports/:id/
+| PATCH | `/imports/:id/update_mapping` | Save column mapping |
+| POST | `/imports/:id/parse` | Parse source |
+| POST | `/imports/:id/confirm` | Run import |
 | POST | `/imports/:id/cancel` | Cancel import |
-| POST | `/imports/:id/dry_run` |
-
-## Dry Run
-
-When `dry_run_enabled` is declared on a target, a "Dry Run" button appears in the preview step. Dry run executes the full import pipeline (transform, validate, persist) inside a rolled-back transaction, giving you a validation report without modifying the database.
-
-## Real-time Updates
-
-DataPorter broadcasts import progress via ActionCable. The channel streams on:
-
-```
-#{cable_channel_prefix}/imports/#{import_id}
-```
-
-The default prefix is `data_porter`, so a typical stream name is `data_porter/imports/42`.
-
-The engine ships with a Stimulus controller that automatically subscribes to the channel and updates the UI during parsing and importing.
-
-## Generators
-
-### `data_porter:install`
-
-```bash
-bin/rails generate data_porter:install
-```
-
-Sets up the migration, initializer, `app/importers/` directory, and mounts the engine.
-
-### `data_porter:target`
-
-```bash
-bin/rails generate data_porter:target ModelName column:type[:required] ...
-```
-
-Examples:
-
-```bash
-bin/rails generate data_porter:target User email:string:required name:string age:integer
-bin/rails generate data_porter:target Product name:string price:decimal
-```
-
-Column format: `name:type[:required]`
-
-Supported types: `string`, `integer`, `decimal`, `boolean`, `date`.
+| POST | `/imports/:id/dry_run` | Dry run validation |
+| | `/mapping_templates` | Full CRUD for templates |
 
 ## Development
 
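The 0.4.0 README lists two status sequences, one for file-based sources (CSV/XLSX) and one for JSON/API sources. As a rough sketch of that ordering, the plain-Ruby lookup below returns the status that follows a given one. The constants and the `flow_for` / `next_status` helpers are hypothetical and not taken from the gem; `failed` and `dry_running` are left out because the README does not place them in either sequence.

```ruby
# Status sequences as listed in the README's "Import Workflow" section.
FILE_BASED_FLOW = %w[pending extracting_headers mapping parsing
                     previewing importing completed].freeze
STRUCTURED_FLOW = %w[pending parsing previewing importing completed].freeze

# Hypothetical helper: picks the sequence for a source type.
def flow_for(source_type)
  %w[csv xlsx].include?(source_type.to_s) ? FILE_BASED_FLOW : STRUCTURED_FLOW
end

# Hypothetical helper: returns the status after `current`, or nil when
# `current` is the last status or is not part of the sequence.
def next_status(source_type, current)
  flow  = flow_for(source_type)
  index = flow.index(current.to_s)
  return nil if index.nil? || index == flow.length - 1

  flow[index + 1]
end

puts next_status(:csv, "pending")  # file-based imports extract headers first
puts next_status(:api, "pending")  # structured imports go straight to parsing
```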
@@ -454,18 +131,8 @@ Supported types: `string`, `integer`, `decimal`, `boolean`, `date`.
 git clone https://github.com/SerylLns/data_porter.git
 cd data_porter
 bin/setup
-
-
-Run the test suite:
-
-```bash
-bundle exec rspec
-```
-
-Run the linter:
-
-```bash
-bundle exec rubocop
+bundle exec rspec # 280 specs
+bundle exec rubocop # 0 offenses
 ```
 
 ## License
data/ROADMAP.md
CHANGED
@@ -1,20 +1,33 @@
 # Roadmap
 
-##
+## Completed
 
-###
+### v0.2.0 -- XLSX Source
+- ~~Parse `.xlsx` files natively via `creek` gem~~
+- ~~Sheet selector via `config["sheet_index"]`~~
+- ~~Same parsing pipeline as CSV~~
+
+### v0.3.0 -- Interactive Column Mapping & Templates
+- ~~Mapping UI: each CSV/XLSX column header gets a dropdown to select the target field~~
+- ~~Save mapping as a reusable template (name + column-to-field pairs)~~
+- ~~Template selector that pre-fills all dropdowns at once~~
+- ~~Stored per-target so each import type has its own template library~~
+- ~~Header extraction step before parsing for file-based sources~~
+- ~~Dynamic mapping priority: user mapping > code mapping > auto-map~~
+
+### v0.4.0 -- Standalone Engine UX
+- ~~Self-contained layout with Stimulus + Turbo Drive via CDN importmap~~
+- ~~Required field indication and duplicate mapping detection~~
+- ~~File validation on create for file-based sources~~
+- ~~Turbo Drive for instant page navigation~~
+- ~~Import details card on show page~~
+- ~~Improved template management UI~~
 
-
-- Parse `.xlsx` files natively (via `creek` or `roo` gem)
-- Sheet selector when the file contains multiple sheets
-- Same parsing pipeline as CSV (prerequisite for column mapping)
+---
 
-
-
-
-- Save mapping as a reusable template (name + column-to-field pairs)
-- Template selector that pre-fills all dropdowns at once
-- Stored per-target so each import type has its own template library
+## Planned
+
+### High Priority
 
 #### Export (reverse workflow)
 - `ExportTarget` DSL mirroring the import Target
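The v0.4.0 roadmap items mention required field indication and duplicate mapping detection. The sketch below illustrates the underlying checks in plain Ruby, assuming a mapping is a hash from file header to target field name. The helper names and sample data are hypothetical; this is not the gem's own implementation.

```ruby
# Hypothetical helper: target fields chosen for more than one file header.
# Blank values stand for headers the user left unmapped.
def duplicate_targets(mapping)
  mapping.values
         .reject { |field| field.nil? || field.empty? }
         .tally
         .select { |_field, count| count > 1 }
         .keys
end

# Hypothetical helper: required target fields that no header is mapped to.
def unmapped_required(mapping, required_fields)
  required_fields - mapping.values
end

mapping = {
  "Name"      => "name",
  "Full name" => "name",  # second header pointing at the same field
  "SKU"       => "sku",
  "Notes"     => ""       # left unmapped
}

puts "Duplicates: #{duplicate_targets(mapping).inspect}"
puts "Missing required: #{unmapped_required(mapping, %w[name price]).inspect}"
```

With this sample data, `name` is reported as a duplicate target and `price` as a required field that is still unmapped.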
@@ -43,6 +56,11 @@
 column :email, type: :email, transform: ->(v) { v.downcase.strip }
 ```
 
+#### Auto-suggest Mapping
+- Fuzzy matching between file headers and target columns
+- Suggest mappings based on Levenshtein distance or string similarity
+- Pre-fill dropdowns with best guesses, user confirms
+
 #### Diff Mode
 - Compare incoming records with existing database data
 - Show what will be created, updated, or left unchanged
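The planned "Auto-suggest Mapping" item proposes fuzzy matching based on Levenshtein distance. Since the feature is only on the roadmap, the following is an illustrative sketch of how such a suggestion could be computed, not the gem's design. The `levenshtein`, `normalize`, and `suggest_field` helpers and the `max_distance` threshold are all assumptions.

```ruby
# Levenshtein edit distance via dynamic programming, keeping only the
# previous and current rows of the distance table.
def levenshtein(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1,          # deletion
                  current[j - 1] + 1,       # insertion
                  previous[j - 1] + cost].min # substitution or match
    end
    previous = current
  end
  previous.last
end

# Lowercases and strips everything except letters and digits, so that
# "E-mail" and "email" compare as equal.
def normalize(text)
  text.to_s.downcase.gsub(/[^a-z0-9]/, "")
end

# Returns the closest target field for a header, or nil when even the
# best candidate is further away than `max_distance` edits.
def suggest_field(header, fields, max_distance: 2)
  best = fields.min_by { |field| levenshtein(normalize(header), normalize(field)) }
  return nil if best.nil?

  levenshtein(normalize(header), normalize(best)) <= max_distance ? best : nil
end

fields = %w[name price sku]
["Prise", "SKU", "Warehouse"].each do |header|
  puts "#{header} -> #{suggest_field(header, fields).inspect}"
end
```

Returning `nil` when nothing is close enough leaves the dropdown empty rather than pre-filling a poor guess, which fits the roadmap's note that the user confirms suggestions.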