data_porter 0.1.0
- checksums.yaml +7 -0
- data/.claude/commands/blog-status.md +10 -0
- data/.claude/commands/blog.md +109 -0
- data/.claude/commands/task-done.md +27 -0
- data/.claude/commands/tm/add-dependency.md +58 -0
- data/.claude/commands/tm/add-subtask.md +79 -0
- data/.claude/commands/tm/add-task.md +81 -0
- data/.claude/commands/tm/analyze-complexity.md +124 -0
- data/.claude/commands/tm/analyze-project.md +100 -0
- data/.claude/commands/tm/auto-implement-tasks.md +100 -0
- data/.claude/commands/tm/command-pipeline.md +80 -0
- data/.claude/commands/tm/complexity-report.md +120 -0
- data/.claude/commands/tm/convert-task-to-subtask.md +74 -0
- data/.claude/commands/tm/expand-all-tasks.md +52 -0
- data/.claude/commands/tm/expand-task.md +52 -0
- data/.claude/commands/tm/fix-dependencies.md +82 -0
- data/.claude/commands/tm/help.md +101 -0
- data/.claude/commands/tm/init-project-quick.md +49 -0
- data/.claude/commands/tm/init-project.md +53 -0
- data/.claude/commands/tm/install-taskmaster.md +118 -0
- data/.claude/commands/tm/learn.md +106 -0
- data/.claude/commands/tm/list-tasks-by-status.md +42 -0
- data/.claude/commands/tm/list-tasks-with-subtasks.md +30 -0
- data/.claude/commands/tm/list-tasks.md +46 -0
- data/.claude/commands/tm/next-task.md +69 -0
- data/.claude/commands/tm/parse-prd-with-research.md +51 -0
- data/.claude/commands/tm/parse-prd.md +52 -0
- data/.claude/commands/tm/project-status.md +67 -0
- data/.claude/commands/tm/quick-install-taskmaster.md +23 -0
- data/.claude/commands/tm/remove-all-subtasks.md +94 -0
- data/.claude/commands/tm/remove-dependency.md +65 -0
- data/.claude/commands/tm/remove-subtask.md +87 -0
- data/.claude/commands/tm/remove-subtasks.md +89 -0
- data/.claude/commands/tm/remove-task.md +110 -0
- data/.claude/commands/tm/setup-models.md +52 -0
- data/.claude/commands/tm/show-task.md +85 -0
- data/.claude/commands/tm/smart-workflow.md +58 -0
- data/.claude/commands/tm/sync-readme.md +120 -0
- data/.claude/commands/tm/tm-main.md +147 -0
- data/.claude/commands/tm/to-cancelled.md +58 -0
- data/.claude/commands/tm/to-deferred.md +50 -0
- data/.claude/commands/tm/to-done.md +47 -0
- data/.claude/commands/tm/to-in-progress.md +39 -0
- data/.claude/commands/tm/to-pending.md +35 -0
- data/.claude/commands/tm/to-review.md +43 -0
- data/.claude/commands/tm/update-single-task.md +122 -0
- data/.claude/commands/tm/update-task.md +75 -0
- data/.claude/commands/tm/update-tasks-from-id.md +111 -0
- data/.claude/commands/tm/validate-dependencies.md +72 -0
- data/.claude/commands/tm/view-models.md +52 -0
- data/.env.example +12 -0
- data/.mcp.json +24 -0
- data/.taskmaster/CLAUDE.md +435 -0
- data/.taskmaster/config.json +44 -0
- data/.taskmaster/docs/prd.txt +2044 -0
- data/.taskmaster/state.json +6 -0
- data/.taskmaster/tasks/task_001.md +19 -0
- data/.taskmaster/tasks/task_002.md +19 -0
- data/.taskmaster/tasks/task_003.md +19 -0
- data/.taskmaster/tasks/task_004.md +19 -0
- data/.taskmaster/tasks/task_005.md +19 -0
- data/.taskmaster/tasks/task_006.md +19 -0
- data/.taskmaster/tasks/task_007.md +19 -0
- data/.taskmaster/tasks/task_008.md +19 -0
- data/.taskmaster/tasks/task_009.md +19 -0
- data/.taskmaster/tasks/task_010.md +19 -0
- data/.taskmaster/tasks/task_011.md +19 -0
- data/.taskmaster/tasks/task_012.md +19 -0
- data/.taskmaster/tasks/task_013.md +19 -0
- data/.taskmaster/tasks/task_014.md +19 -0
- data/.taskmaster/tasks/task_015.md +19 -0
- data/.taskmaster/tasks/task_016.md +19 -0
- data/.taskmaster/tasks/task_017.md +19 -0
- data/.taskmaster/tasks/task_018.md +19 -0
- data/.taskmaster/tasks/task_019.md +19 -0
- data/.taskmaster/tasks/task_020.md +19 -0
- data/.taskmaster/tasks/tasks.json +299 -0
- data/.taskmaster/templates/example_prd.txt +47 -0
- data/.taskmaster/templates/example_prd_rpg.txt +511 -0
- data/CHANGELOG.md +29 -0
- data/CLAUDE.md +65 -0
- data/CODE_OF_CONDUCT.md +10 -0
- data/CONTRIBUTING.md +49 -0
- data/LICENSE +21 -0
- data/README.md +463 -0
- data/Rakefile +12 -0
- data/app/assets/stylesheets/data_porter/application.css +646 -0
- data/app/channels/data_porter/import_channel.rb +10 -0
- data/app/controllers/data_porter/imports_controller.rb +68 -0
- data/app/javascript/data_porter/progress_controller.js +33 -0
- data/app/jobs/data_porter/dry_run_job.rb +12 -0
- data/app/jobs/data_porter/import_job.rb +12 -0
- data/app/jobs/data_porter/parse_job.rb +12 -0
- data/app/models/data_porter/data_import.rb +49 -0
- data/app/views/data_porter/imports/index.html.erb +142 -0
- data/app/views/data_porter/imports/new.html.erb +88 -0
- data/app/views/data_porter/imports/show.html.erb +49 -0
- data/config/database.yml +3 -0
- data/config/routes.rb +12 -0
- data/docs/SPEC.md +2012 -0
- data/docs/UI.md +32 -0
- data/docs/blog/001-why-build-a-data-import-engine.md +166 -0
- data/docs/blog/002-scaffolding-a-rails-engine.md +188 -0
- data/docs/blog/003-configuration-dsl.md +222 -0
- data/docs/blog/004-store-model-jsonb.md +237 -0
- data/docs/blog/005-target-dsl.md +284 -0
- data/docs/blog/006-parsing-csv-sources.md +300 -0
- data/docs/blog/007-orchestrator.md +247 -0
- data/docs/blog/008-actioncable-stimulus.md +376 -0
- data/docs/blog/009-phlex-ui-components.md +446 -0
- data/docs/blog/010-controllers-routing.md +374 -0
- data/docs/blog/011-generators.md +364 -0
- data/docs/blog/012-json-api-sources.md +323 -0
- data/docs/blog/013-testing-rails-engine.md +618 -0
- data/docs/blog/014-dry-run.md +307 -0
- data/docs/blog/015-publishing-retro.md +264 -0
- data/docs/blog/016-erb-view-templates.md +431 -0
- data/docs/blog/017-showcase-final-retro.md +220 -0
- data/docs/blog/BACKLOG.md +8 -0
- data/docs/blog/SERIES.md +154 -0
- data/docs/screenshots/index-with-previewing.jpg +0 -0
- data/docs/screenshots/index.jpg +0 -0
- data/docs/screenshots/modal-new-import.jpg +0 -0
- data/docs/screenshots/preview.jpg +0 -0
- data/lib/data_porter/broadcaster.rb +29 -0
- data/lib/data_porter/components/base.rb +10 -0
- data/lib/data_porter/components/failure_alert.rb +20 -0
- data/lib/data_porter/components/preview_table.rb +54 -0
- data/lib/data_porter/components/progress_bar.rb +33 -0
- data/lib/data_porter/components/results_summary.rb +19 -0
- data/lib/data_porter/components/status_badge.rb +16 -0
- data/lib/data_porter/components/summary_cards.rb +30 -0
- data/lib/data_porter/components.rb +14 -0
- data/lib/data_porter/configuration.rb +25 -0
- data/lib/data_porter/dsl/api_config.rb +25 -0
- data/lib/data_porter/dsl/column.rb +17 -0
- data/lib/data_porter/engine.rb +15 -0
- data/lib/data_porter/orchestrator.rb +141 -0
- data/lib/data_porter/record_validator.rb +32 -0
- data/lib/data_porter/registry.rb +33 -0
- data/lib/data_porter/sources/api.rb +49 -0
- data/lib/data_porter/sources/base.rb +35 -0
- data/lib/data_porter/sources/csv.rb +43 -0
- data/lib/data_porter/sources/json.rb +45 -0
- data/lib/data_porter/sources.rb +20 -0
- data/lib/data_porter/store_models/error.rb +13 -0
- data/lib/data_porter/store_models/import_record.rb +52 -0
- data/lib/data_porter/store_models/report.rb +21 -0
- data/lib/data_porter/target.rb +89 -0
- data/lib/data_porter/type_validator.rb +46 -0
- data/lib/data_porter/version.rb +5 -0
- data/lib/data_porter.rb +32 -0
- data/lib/generators/data_porter/install/install_generator.rb +33 -0
- data/lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb +21 -0
- data/lib/generators/data_porter/install/templates/initializer.rb +30 -0
- data/lib/generators/data_porter/target/target_generator.rb +44 -0
- data/lib/generators/data_porter/target/templates/target.rb.tt +20 -0
- data/sig/data_porter.rbs +4 -0
- metadata +274 -0
data/docs/UI.md
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
#### UI UX
|
|
2
|
+
|
|
3
|
+
- Phlex
|
|
4
|
+
- Tailwind
|
|
5
|
+
|
|
6
|
+
```js
|
|
7
|
+
// tailwind.config.js (inside the gem, at build time)
|
|
8
|
+
module.exports = {
|
|
9
|
+
prefix: 'dp-',
|
|
10
|
+
important: '.data-porter',
|
|
11
|
+
content: [
|
|
12
|
+
'./app/views/data_porter/**/*.erb',
|
|
13
|
+
'./lib/data_porter/components/**/*.rb',
|
|
14
|
+
'./app/javascript/data_porter/**/*.js'
|
|
15
|
+
],
|
|
16
|
+
corePlugins: {
|
|
17
|
+
preflight: false // no global reset; we leave the host app's styles untouched
|
|
18
|
+
},
|
|
19
|
+
theme: {
|
|
20
|
+
extend: {
|
|
21
|
+
colors: {
|
|
22
|
+
complete: 'var(--dp-color-complete, #16a34a)',
|
|
23
|
+
partial: 'var(--dp-color-partial, #ca8a04)',
|
|
24
|
+
missing: 'var(--dp-color-missing, #dc2626)',
|
|
25
|
+
primary: 'var(--dp-color-primary, #6366f1)',
|
|
26
|
+
}
|
|
27
|
+
}
|
|
28
|
+
}
|
|
29
|
+
}
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
---
|
|
@@ -0,0 +1,166 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: "Building DataPorter #1 — Why build a data import engine?"
|
|
3
|
+
series: "Building DataPorter - A Data Import Engine for Rails"
|
|
4
|
+
part: 1
|
|
5
|
+
tags: [ruby, rails, rails-engine, gem-development, architecture, open-source]
|
|
6
|
+
published: false
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Why build a data import engine?
|
|
10
|
+
|
|
11
|
+
> Every Rails app eventually needs to import data. Let's stop rewriting the same workflow every time.
|
|
12
|
+
|
|
13
|
+
## Context
|
|
14
|
+
|
|
15
|
+
This is the first article in a series where we build **DataPorter**, a mountable Rails engine for data import workflows, from scratch. We'll go from `bundle gem` to a published rubygem, covering architecture decisions, DSL design, testing strategies, and everything in between.
|
|
16
|
+
|
|
17
|
+
By the end of this series, you'll have a deep understanding of how to build a production-ready Rails engine — and a reusable gem to show for it.
|
|
18
|
+
|
|
19
|
+
## The problem
|
|
20
|
+
|
|
21
|
+
If you've worked on any non-trivial Rails application, you've probably written this code more than once:
|
|
22
|
+
|
|
23
|
+
1. Upload a CSV (or fetch data from an API)
|
|
24
|
+
2. Parse and validate each row
|
|
25
|
+
3. Show the user what's about to be imported
|
|
26
|
+
4. Persist the valid records to the database
|
|
27
|
+
|
|
28
|
+
Maybe it was a guest list for a hotel app. Maybe vendor data for an e-commerce platform. Maybe scraped listings from an external API.
|
|
29
|
+
|
|
30
|
+
The specifics change, but the workflow is always the same. And every time, we rebuild it from scratch: a controller action here, some CSV parsing there, a background job, maybe a progress bar if we're feeling fancy.
|
|
31
|
+
|
|
32
|
+
The result? Scattered import logic across controllers, services, and jobs. No consistency. No reuse. Every new import type means rewriting the same infrastructure.
|
|
33
|
+
|
|
34
|
+
## What we're building
|
|
35
|
+
|
|
36
|
+
DataPorter is a mountable Rails engine that provides the entire import infrastructure. The host app only defines the business part: *what* to import and *how* to persist it.
|
|
37
|
+
|
|
38
|
+
One file, one class, one import type:
|
|
39
|
+
|
|
40
|
+
```ruby
|
|
41
|
+
# app/importers/guests_target.rb
|
|
42
|
+
class GuestsTarget < DataPorter::Target
|
|
43
|
+
label "Guests"
|
|
44
|
+
model Guest
|
|
45
|
+
sources :csv, :json
|
|
46
|
+
|
|
47
|
+
columns do
|
|
48
|
+
column :first_name, type: :string, required: true
|
|
49
|
+
column :last_name, type: :string, required: true
|
|
50
|
+
column :email, type: :email
|
|
51
|
+
column :phone, type: :phone
|
|
52
|
+
end
|
|
53
|
+
|
|
54
|
+
def persist(record, context:)
|
|
55
|
+
Guest.create!(hotel: context.hotel, **record.attributes)
|
|
56
|
+
end
|
|
57
|
+
end
|
|
58
|
+
```
|
|
59
|
+
|
|
60
|
+
That's it. DataPorter handles the rest: file upload, parsing, validation, preview UI, progress tracking, error reporting, and background processing.
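
To make the class-level macros above less magical, here is a minimal plain-Ruby sketch of how a `label` / `columns do ... end` DSL could collect its definitions. It is an illustration under assumptions, not DataPorter's actual implementation: `SketchTarget`, `ColumnCollector`, and `GuestsSketch` are hypothetical names, and the real `DataPorter::Target` may store its metadata differently.

```ruby
# Hypothetical sketch of a class-level DSL; not DataPorter's real code.
class SketchTarget
  # One declared column: its name, type, and whether it is required.
  Column = Struct.new(:name, :type, :required, keyword_init: true)

  # Receives the `column` calls made inside a `columns do ... end` block.
  class ColumnCollector
    attr_reader :columns

    def initialize
      @columns = []
    end

    def column(name, type: :string, required: false)
      @columns << Column.new(name: name, type: type, required: required)
    end
  end

  class << self
    # Acts as a setter when given a value and as a getter otherwise.
    def label(value = nil)
      @label = value if value
      @label
    end

    # With a block, records the columns; without one, returns them.
    def columns(&block)
      return @columns || [] unless block

      collector = ColumnCollector.new
      collector.instance_eval(&block)
      @columns = collector.columns
    end
  end
end

class GuestsSketch < SketchTarget
  label "Guests"

  columns do
    column :first_name, type: :string, required: true
    column :email, type: :email
  end
end

puts GuestsSketch.label                        # Guests
puts GuestsSketch.columns.map(&:name).inspect  # [:first_name, :email]
```

Because the definitions live in class-level instance variables, each subclass keeps its own label and column list, which is what lets one file describe one import type.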
|
|
61
|
+
|
|
62
|
+
The workflow is always three steps:
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
Upload / Configure → Preview & Validate → Import
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
## Why not use what already exists?
|
|
69
|
+
|
|
70
|
+
There are existing solutions in the Rails ecosystem. Let's look at the two most common approaches.
|
|
71
|
+
|
|
72
|
+
### The DIY approach
|
|
73
|
+
|
|
74
|
+
Most teams build custom import flows per model. It works, but it doesn't scale. By the third import type, you're copy-pasting controller actions and wishing you had abstracted earlier.
|
|
75
|
+
|
|
76
|
+
### maintenance_tasks
|
|
77
|
+
|
|
78
|
+
Shopify's [maintenance_tasks](https://github.com/Shopify/maintenance_tasks) gem is excellent for one-off data processing scripts. It provides a UI, background processing, and CSV support.
|
|
79
|
+
|
|
80
|
+
But it solves a different problem. It's designed for fire-and-forget maintenance operations, not interactive import workflows.
|
|
81
|
+
|
|
82
|
+
| Aspect | maintenance_tasks | DataPorter |
|
|
83
|
+
|--------|-------------------|------------|
|
|
84
|
+
| Purpose | One-off scripts | Import workflows |
|
|
85
|
+
| Preview before import | No | Yes |
|
|
86
|
+
| Visual validation | No | Yes (complete/partial/missing) |
|
|
87
|
+
| Multi-step workflow | No (fire & forget) | Yes (parse -> preview -> import) |
|
|
88
|
+
| Real-time progress | No | Yes (ActionCable) |
|
|
89
|
+
| Data sources | CSV, ActiveRecord | CSV, JSON, API (extensible) |
|
|
90
|
+
| Auto-generated UI | Parameter form | Dynamic column table |
|
|
91
|
+
|
|
92
|
+
The key difference: DataPorter adds a **human validation step** between parsing and persisting. The user sees exactly what will be imported, with clear status indicators for each row, before anything touches the database.
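
As a rough illustration of what a per-row status indicator could look like, the sketch below classifies a parsed row as `:complete`, `:partial`, or `:missing`. The rule set is an assumption made for this example (a blank required column means `:missing`, a blank optional column means `:partial`); the article does not define the three statuses, and `SketchColumn` and `row_status` are hypothetical names.

```ruby
# Hypothetical classification rules; DataPorter's real logic may differ.
SketchColumn = Struct.new(:name, :required)

# Returns :missing when a required value is blank, :partial when only
# optional values are blank, and :complete when every column is filled.
def row_status(row, columns)
  blank = ->(value) { value.nil? || value.to_s.strip.empty? }

  return :missing if columns.any? { |c| c.required && blank.call(row[c.name]) }
  return :partial if columns.any? { |c| blank.call(row[c.name]) }

  :complete
end

SKETCH_COLUMNS = [
  SketchColumn.new(:first_name, true),
  SketchColumn.new(:email, false)
].freeze

puts row_status({ first_name: "Ada", email: "ada@example.com" }, SKETCH_COLUMNS) # complete
puts row_status({ first_name: "Ada", email: "" }, SKETCH_COLUMNS)                # partial
puts row_status({ first_name: nil, email: "ada@example.com" }, SKETCH_COLUMNS)   # missing
```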
|
|
93
|
+
|
|
94
|
+
## Architecture overview
|
|
95
|
+
|
|
96
|
+
DataPorter is split into two clear layers:
|
|
97
|
+
|
|
98
|
+
```
|
|
99
|
+
┌─────────────────────────────────────┐
|
|
100
|
+
│ DataPorter (the gem) │
|
|
101
|
+
│ │
|
|
102
|
+
│ Engine, Model, State Machine, │
|
|
103
|
+
│ Sources, Orchestrator, Jobs, │
|
|
104
|
+
│ ActionCable, UI, DSL, Registry, │
|
|
105
|
+
│ Generators │
|
|
106
|
+
└──────────────┬──────────────────────┘
|
|
107
|
+
│ mount + configure + define targets
|
|
108
|
+
┌──────────────┴──────────────────────┐
|
|
109
|
+
│ Host App │
|
|
110
|
+
│ │
|
|
111
|
+
│ Initializer, Target files, │
|
|
112
|
+
│ Auth (parent controller), │
|
|
113
|
+
│ Style overrides (optional) │
|
|
114
|
+
└─────────────────────────────────────┘
|
|
115
|
+
```
|
|
116
|
+
|
|
117
|
+
The gem owns the infrastructure. The host app owns the business logic. This separation is the core design principle we'll follow throughout the series.
|
|
118
|
+
|
|
119
|
+
## The tech stack
|
|
120
|
+
|
|
121
|
+
Here's what we'll use and why:
|
|
122
|
+
|
|
123
|
+
| Dependency | Role | Why |
|
|
124
|
+
|------------|------|-----|
|
|
125
|
+
| **store_model** | Typed JSONB attributes | Store import records as structured data without extra tables |
|
|
126
|
+
| **phlex** | View components | Ruby-native views, easier to test and namespace than ERB |
|
|
127
|
+
| **turbo-rails** | Page updates | Turbo Frames for partial reloads during the import flow |
|
|
128
|
+
| **stimulus** | JS behavior | Progress bar updates via ActionCable |
|
|
129
|
+
| **Tailwind CSS** | Styling | Scoped with `dp-` prefix to avoid host app conflicts |
|
|
130
|
+
|
|
131
|
+
We'll also rely heavily on Rails built-ins: ActiveJob for background processing, ActionCable for real-time updates, ActiveStorage for file uploads, and an enum-based state machine for the import lifecycle.
|
|
132
|
+
|
|
133
|
+
## What this series covers
|
|
134
|
+
|
|
135
|
+
Here's the roadmap — each part is a standalone article:
|
|
136
|
+
|
|
137
|
+
1. **Why build a data import engine?** (this article)
|
|
138
|
+
2. **Scaffolding a Rails Engine gem** — gem structure, Engine setup
|
|
139
|
+
3. **Configuration DSL** — making the gem flexible
|
|
140
|
+
4. **StoreModel & JSONB** — modeling import data without extra tables
|
|
141
|
+
5. **Target DSL** — one file = one import type
|
|
142
|
+
6. **CSV parsing with Sources** — the first end-to-end flow
|
|
143
|
+
7. **The Orchestrator** — coordinating parse and import
|
|
144
|
+
8. **ActionCable & Stimulus** — real-time progress
|
|
145
|
+
9. **Phlex & Tailwind UI** — auto-generated preview tables
|
|
146
|
+
10. **Controllers & routing** — engine controllers done right
|
|
147
|
+
11. **Generators** — install in one command
|
|
148
|
+
12. **JSON & API sources** — beyond CSV
|
|
149
|
+
13. **Testing a Rails Engine** — specs for an isolated engine
|
|
150
|
+
14. **Dry Run mode** — validate against the database before importing
|
|
151
|
+
15. **Publishing & retrospective** — from repo to rubygems.org
|
|
152
|
+
|
|
153
|
+
## Recap
|
|
154
|
+
|
|
155
|
+
- Data import is a recurring pattern in Rails apps that deserves a reusable solution
|
|
156
|
+
- DataPorter provides the infrastructure (upload, parse, preview, import) while the host app defines the business logic
|
|
157
|
+
- The 3-step workflow with human validation is what sets it apart from existing tools
|
|
158
|
+
- We're building a proper mountable Rails engine with a clean separation between gem and host app
|
|
159
|
+
|
|
160
|
+
## Next up
|
|
161
|
+
|
|
162
|
+
In the next article, we'll run `bundle gem data_porter`, set up the Rails Engine with `isolate_namespace`, structure our directories, and configure the gemspec with our dependencies. We'll make our first architectural decisions — and explain why they matter.
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
166
|
+
*This is part 1 of the series "Building DataPorter - A Data Import Engine for Rails". [Next: Scaffolding a Rails Engine gem](#)*
|
|
@@ -0,0 +1,188 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: "Building DataPorter #2 — Scaffolding a Rails Engine gem"
|
|
3
|
+
series: "Building DataPorter - A Data Import Engine for Rails"
|
|
4
|
+
part: 2
|
|
5
|
+
tags: [ruby, rails, rails-engine, gem-development, architecture]
|
|
6
|
+
published: false
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Scaffolding a Rails Engine gem
|
|
10
|
+
|
|
11
|
+
> Running `bundle gem` is just the beginning. Here's how to turn a blank gem into a proper Rails Engine with isolated namespacing, auto-loading, and a clean dependency story.
|
|
12
|
+
|
|
13
|
+
## Context
|
|
14
|
+
|
|
15
|
+
In [Part 1](/docs/blog/001-why-build-a-data-import-engine.md), we defined the problem DataPorter solves: a reusable, multi-step import workflow for Rails apps. We talked about architecture, the tech stack, and what sets this gem apart from existing solutions.
|
|
16
|
+
|
|
17
|
+
Now it's time to write code. In this article, we'll scaffold the gem, wire up the Rails Engine, and make our first real decisions about directory structure, namespacing, and dependencies.
|
|
18
|
+
|
|
19
|
+
## The problem
|
|
20
|
+
|
|
21
|
+
`bundle gem data_porter` gives you a perfectly valid Ruby gem, but only a Ruby gem, not a Rails Engine. There's no `config/routes.rb`, no autoloading of models or controllers, no way for a host app to mount your gem at a path. A raw gem doesn't know anything about Rails.
|
|
22
|
+
|
|
23
|
+
Turning that skeleton into a mountable engine means answering a few questions upfront: How do we isolate our namespace so we don't collide with the host app? How do we structure directories so Rails autoloading works inside the gem? Which dependencies do we declare, and how tightly do we pin them?
|
|
24
|
+
|
|
25
|
+
Get these wrong and you'll be fighting Rails conventions for the rest of the project. Get them right and everything else just works.
|
|
26
|
+
|
|
27
|
+
## What we're building
|
|
28
|
+
|
|
29
|
+
By the end of this article, the gem will have a working Engine, an isolated namespace, auto-discovery of host app import targets, and a clean gemspec. Here's the directory tree we're aiming for:
|
|
30
|
+
|
|
31
|
+
```
|
|
32
|
+
data_porter/
|
|
33
|
+
config/
|
|
34
|
+
routes.rb # Engine routes (empty for now)
|
|
35
|
+
lib/
|
|
36
|
+
data_porter.rb # Entry point: requires + module-level API
|
|
37
|
+
data_porter/
|
|
38
|
+
version.rb # Version constant
|
|
39
|
+
engine.rb # The Rails Engine
|
|
40
|
+
configuration.rb # Config object (teased here, detailed in Part 3)
|
|
41
|
+
spec/
|
|
42
|
+
data_porter/
|
|
43
|
+
engine_spec.rb # Engine specs
|
|
44
|
+
data_porter.gemspec # Dependencies and metadata
|
|
45
|
+
```
|
|
46
|
+
|
|
47
|
+
This is deliberately minimal. We'll add `app/models`, `app/controllers`, and `app/views` directories as we build those layers in later articles. Rails Engine autoloading picks them up automatically once they exist, so there's no reason to create empty folders now.
|
|
48
|
+
|
|
49
|
+
## Implementation
|
|
50
|
+
|
|
51
|
+
### Step 1 -- The Engine class
|
|
52
|
+
|
|
53
|
+
The Engine is the single most important file in a Rails Engine gem. It's the bridge between your gem and the host application's Rails stack.
|
|
54
|
+
|
|
55
|
+
```ruby
|
|
56
|
+
# lib/data_porter/engine.rb
|
|
57
|
+
module DataPorter
|
|
58
|
+
class Engine < ::Rails::Engine
|
|
59
|
+
isolate_namespace DataPorter
|
|
60
|
+
|
|
61
|
+
config.to_prepare do
|
|
62
|
+
Dir[Rails.root.join("app/importers/*_target.rb")].each { |f| require f }
|
|
63
|
+
end
|
|
64
|
+
end
|
|
65
|
+
end
|
|
66
|
+
```
|
|
67
|
+
|
|
68
|
+
Three things happen in these few lines, and each one matters.
|
|
69
|
+
|
|
70
|
+
First, we inherit from `::Rails::Engine`. The leading `::` ensures we reference the top-level `Rails` module, not anything that might be nested under `DataPorter`. This is the line that tells Rails "this gem provides engine functionality" -- routes, autoloading, migrations, the works.
|
|
71
|
+
|
|
72
|
+
Second, `isolate_namespace DataPorter` is the key architectural decision. It tells Rails to scope everything the engine provides -- routes, models, controllers -- under the `DataPorter` module. Without it, a `DataImport` model in the engine would collide with a `DataImport` model in the host app. Isolation means our table names become `data_porter_data_imports`, our routes get prefixed, and the host app stays clean.
|
|
73
|
+
|
|
74
|
+
Third, the `config.to_prepare` block handles auto-discovery. We want host apps to define import targets in `app/importers/` using a simple naming convention: `*_target.rb`. The `to_prepare` callback runs once at boot in every environment and again on each code reload in development (so changes are picked up without a restart). This is the standard Rails hook for loading code that needs to exist before the app serves requests.
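
The naming convention is easy to check outside Rails. The sketch below reproduces only the glob part of the discovery step against a temporary directory; `discover_target_files` is a hypothetical helper written for this example, and the engine itself passes `Rails.root` rather than a temp path.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the sorted paths that match the
# app/importers/*_target.rb convention under the given root.
def discover_target_files(app_root)
  Dir[File.join(app_root, "app/importers/*_target.rb")].sort
end

Dir.mktmpdir do |root|
  importers = File.join(root, "app/importers")
  FileUtils.mkdir_p(importers)

  # Only the first file follows the *_target.rb convention.
  FileUtils.touch(File.join(importers, "guests_target.rb"))
  FileUtils.touch(File.join(importers, "helper.rb"))

  names = discover_target_files(root).map { |path| File.basename(path) }
  puts names.inspect # ["guests_target.rb"]
end
```

One consequence worth knowing: a single-level `*` glob does not descend into subdirectories, so a target placed in a nested folder such as `app/importers/admin/` would not match this pattern.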
|
|
75
|
+
|
|
76
|
+
### Step 2 -- The entry point
|
|
77
|
+
|
|
78
|
+
Every gem needs a single file that bootstraps everything. Ours is `lib/data_porter.rb`, and it sets the order of operations.
|
|
79
|
+
|
|
80
|
+
```ruby
|
|
81
|
+
# lib/data_porter.rb
|
|
82
|
+
require "rails/engine"
|
|
83
|
+
require_relative "data_porter/version"
|
|
84
|
+
require_relative "data_porter/configuration"
|
|
85
|
+
require_relative "data_porter/engine"
|
|
86
|
+
|
|
87
|
+
module DataPorter
|
|
88
|
+
class Error < StandardError; end
|
|
89
|
+
|
|
90
|
+
def self.configuration
|
|
91
|
+
@configuration ||= Configuration.new
|
|
92
|
+
end
|
|
93
|
+
|
|
94
|
+
def self.configure
|
|
95
|
+
yield(configuration)
|
|
96
|
+
end
|
|
97
|
+
end
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
The require order is intentional. We load `rails/engine` first because `DataPorter::Engine` inherits from it. Then the version (needed by the gemspec), then configuration (needed by the module-level API), then the engine itself.
|
|
101
|
+
|
|
102
|
+
The `DataPorter` module also defines the gem's public API surface. `DataPorter.configure` with a block is the pattern host apps will use to customize behavior. `DataPorter::Error` gives us a base exception class to inherit from in later parts. We'll dive deep into the configuration DSL in Part 3 -- for now, the important thing is that the scaffolding is in place.
|
|
103
|
+
|
|
104
|
+
Notice that we use `require_relative` for our own files and `require` for external dependencies. This is a minor but useful convention: `require_relative` is resolved from the current file's location, so it works regardless of how the gem is loaded. `require` goes through the Ruby load path, which is correct for third-party gems.
|
|
105
|
+
|
|
106
|
+
### Step 3 -- The gemspec
|
|
107
|
+
|
|
108
|
+
The gemspec declares what the gem needs to run and how it should be packaged. Let's look at the dependency section, which is where the real decisions live.
|
|
109
|
+
|
|
110
|
+
```ruby
|
|
111
|
+
# data_porter.gemspec
|
|
112
|
+
Gem::Specification.new do |spec|
|
|
113
|
+
spec.name = "data_porter"
|
|
114
|
+
spec.version = DataPorter::VERSION
|
|
115
|
+
spec.required_ruby_version = ">= 3.2.0"
|
|
116
|
+
|
|
117
|
+
# ...metadata...
|
|
118
|
+
|
|
119
|
+
spec.add_dependency "rails", ">= 7.0"
|
|
120
|
+
spec.add_dependency "store_model", ">= 2.0"
|
|
121
|
+
spec.add_dependency "phlex", ">= 1.0"
|
|
122
|
+
spec.add_dependency "turbo-rails", ">= 1.0"
|
|
123
|
+
end
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
Four runtime dependencies, each chosen for a specific reason.
|
|
127
|
+
|
|
128
|
+
**rails >= 7.0** is the floor. We use features like `enum` improvements and `config.to_prepare` patterns that are stable from Rails 7 onward. No upper bound -- we trust semver and test against multiple Rails versions in CI.
|
|
129
|
+
|
|
130
|
+
**store_model >= 2.0** lets us define typed, validatable Ruby objects backed by JSONB columns. We'll use it to model import records, errors, and reports without creating extra database tables. Part 4 covers this in detail.
|
|
131
|
+
|
|
132
|
+
**phlex >= 1.0** gives us Ruby-native view components. We chose it over ViewComponent because Phlex components are plain Ruby classes -- easier to namespace, easier to test, no template files to manage inside a gem.
|
|
133
|
+
|
|
134
|
+
**turbo-rails >= 1.0** provides Turbo Frames and Turbo Streams. The import workflow is a natural fit for Turbo: upload in one frame, preview in another, progress updates via streams.
|
|
135
|
+
|
|
136
|
+
All version constraints use `>=` with a floor, not `~>` with a pessimistic lock. This is deliberate for a library gem. A host app should control the exact versions through its `Gemfile.lock`. If we locked `phlex ~> 1.0`, we'd block any host app that wants to use Phlex 2.x. The trade-off is that we need CI coverage across versions, but that's a solvable problem (and we'll set that up in a later article).
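
The difference between the two operators can be verified with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. This is a small standalone check, not part of the gem; the version numbers are arbitrary examples.

```ruby
require "rubygems"

pessimistic = Gem::Requirement.new("~> 1.0") # means >= 1.0 and < 2.0
optimistic  = Gem::Requirement.new(">= 1.0") # floor only, no ceiling

puts pessimistic.satisfied_by?(Gem::Version.new("1.9.3")) # true
puts pessimistic.satisfied_by?(Gem::Version.new("2.0.0")) # false
puts optimistic.satisfied_by?(Gem::Version.new("2.0.0"))  # true
```

With the pessimistic constraint, Bundler could not resolve a host app's `Gemfile` that asks for a 2.x release, which is exactly the situation the `>=` floor avoids.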
|
|
137
|
+
|
|
138
|
+
Also worth noting: the `files` array in the gemspec uses `git ls-files` and explicitly excludes test files, bin scripts, and CI config. The published gem should only include what's needed to run it.
|
|
139
|
+
|
|
140
|
+
## Decisions and tradeoffs
|
|
141
|
+
|
|
142
|
+
| Decision | We chose | Over | Because |
|
|
143
|
+
|----------|----------|------|---------|
|
|
144
|
+
| Namespace isolation | `isolate_namespace` | Full engine (no isolation) | Prevents model/route collisions with host app |
|
|
145
|
+
| Version constraints | `>= floor` (optimistic) | `~> x.y` (pessimistic) | Library gems should not lock transitive dependencies tightly |
|
|
146
|
+
| Target auto-discovery | `config.to_prepare` with Dir glob | Explicit registration API | Convention over configuration; zero boilerplate for host apps |
|
|
147
|
+
| View layer | Phlex | ERB / ViewComponent | Pure Ruby classes are easier to namespace and test inside a gem |
|
|
148
|
+
| Require strategy | `require_relative` for internal files | `require` for everything | Resolves from file location, works regardless of load path state |
|
|
149
|
+
|
|
150
|
+
## Testing it
|
|
151
|
+
|
|
152
|
+
We can verify the engine is wired up correctly with a focused spec.
|
|
153
|
+
|
|
154
|
+
```ruby
|
|
155
|
+
# spec/data_porter/engine_spec.rb
|
|
156
|
+
RSpec.describe DataPorter::Engine do
|
|
157
|
+
it "is a Rails::Engine" do
|
|
158
|
+
expect(described_class.superclass).to eq(Rails::Engine)
|
|
159
|
+
end
|
|
160
|
+
|
|
161
|
+
it "has an isolated namespace" do
|
|
162
|
+
expect(described_class.isolated?).to be true
|
|
163
|
+
end
|
|
164
|
+
|
|
165
|
+
it "is namespaced under DataPorter" do
|
|
166
|
+
expect(described_class.engine_name).to eq("data_porter")
|
|
167
|
+
end
|
|
168
|
+
end
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
These three assertions confirm the fundamentals: the class hierarchy is correct, isolation is active, and the engine name Rails derives matches what we expect. The `isolated?` check is particularly important -- if someone accidentally removes the `isolate_namespace` line, this test catches it immediately.
|
|
172
|
+
|
|
173
|
+
## Recap
|
|
174
|
+
|
|
175
|
+
- A Rails Engine is what turns a Ruby gem into something a Rails app can mount, with routes, autoloading, and migrations.
|
|
176
|
+
- `isolate_namespace` prevents name collisions between the engine and the host app. It's a one-liner, but it shapes the entire architecture.
|
|
177
|
+
- The entry point (`lib/data_porter.rb`) sets up the require order and defines the gem's public API surface.
|
|
178
|
+
- Gemspec dependencies should use optimistic version constraints (`>=`) so host apps stay in control of their dependency tree.
|
|
179
|
+
- The `config.to_prepare` callback gives us zero-config auto-discovery of import targets in the host app.
|
|
180
|
+
|
|
181
|
+
## Next up
|
|
182
|
+
|
|
183
|
+
We have a working engine, but it's not configurable yet. In Part 3, we'll build the configuration DSL -- the `DataPorter.configure` block that lets host apps customize parent controllers, queue names, storage backends, and more. We'll look at why a dedicated configuration object beats scattering settings across `Rails.application.config`, and how to design defaults that make sense out of the box.
|
|
184
|
+
|
|
185
|
+
---
|
|
186
|
+
|
|
187
|
+
*This is part 2 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Why build a data import engine?](/docs/blog/001-why-build-a-data-import-engine.md) | [Next: Configuration DSL](#)*
|
|
188
|
+
*Code: [GitHub](https://github.com/SerylLns/data_porter)*
|
|
@@ -0,0 +1,222 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: "Building DataPorter #3 — Configuration DSL: making the gem flexible"
|
|
3
|
+
series: "Building DataPorter - A Data Import Engine for Rails"
|
|
4
|
+
part: 3
|
|
5
|
+
tags: [ruby, rails, rails-engine, gem-development, dsl, configuration, singleton-pattern]
|
|
6
|
+
published: false
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Configuration DSL: making the gem flexible
|
|
10
|
+
|
|
11
|
+
> A gem that can't adapt to its host app will never leave your own repo.
|
|
12
|
+
|
|
13
|
+
## Context
|
|
14
|
+
|
|
15
|
+
This is part 3 of the series where we build **DataPorter**, a mountable Rails engine for data import workflows. In [part 1](#), we established the problem and architecture. Part 2 covered scaffolding the engine gem with `isolate_namespace`.
|
|
16
|
+
|
|
17
|
+
In this article, we'll build the configuration layer: a clean DSL that lets host apps customize DataPorter's behavior through an initializer. By the end, you'll have a `DataPorter.configure` block that feels like any well-designed Rails gem.

## The problem

Our engine needs to run inside apps we don't control. One app uses Sidekiq with a custom queue name. Another stores files on S3. A third needs every import scoped to the current hotel's account.

Hard-coding any of these choices inside the gem would make it useless to anyone whose setup differs from ours. But making *everything* configurable turns the gem into a configuration puzzle where nobody remembers what goes where.

The challenge is finding the line between flexibility and convention -- and expressing it through an API that feels obvious on first read.

## What we're building

Here's the end result from the host app's perspective:

```ruby
# config/initializers/data_porter.rb
DataPorter.configure do |config|
  config.parent_controller = "Admin::BaseController"
  config.queue_name = :low_priority
  config.storage_service = :amazon
  config.preview_limit = 200

  config.context_builder = ->(controller) {
    { hotel: controller.current_hotel, user: controller.current_user }
  }
end
```

This reads much like a Devise or Sidekiq initializer. A `configure` block yields a plain object with sensible defaults. If you don't call `configure` at all, everything still works.

## Implementation

### Step 1 -- The Configuration class

We need an object that holds every configurable value and provides reasonable defaults out of the box. The simplest approach that works: a plain Ruby class with `attr_accessor` and defaults set in `initialize`.

```ruby
# lib/data_porter/configuration.rb
module DataPorter
  class Configuration
    attr_accessor :parent_controller,
                  :queue_name,
                  :storage_service,
                  :cable_channel_prefix,
                  :context_builder,
                  :preview_limit,
                  :enabled_sources,
                  :scope

    def initialize
      @parent_controller = "ApplicationController" # stored as a String, resolved later
      @queue_name = :imports
      @storage_service = :local                     # requires zero setup
      @cable_channel_prefix = "data_porter"
      @context_builder = nil                        # opt-in
      @preview_limit = 500                          # max rows shown in the preview
      @enabled_sources = %i[csv json api]
      @scope = nil                                  # opt-in
    end
  end
end
```

Every attribute has a default that makes the gem work without any initializer. `parent_controller` defaults to `"ApplicationController"` because a freshly generated Rails app always defines it. `storage_service` defaults to `:local` because that requires zero setup. `preview_limit` caps the preview at 500 rows to keep the page responsive.

Two attributes default to `nil` on purpose: `context_builder` and `scope`. These are opt-in features. When `context_builder` is `nil`, imports run without host-specific context. When `scope` is `nil`, the engine shows all imports. The gem checks for `nil` and adapts its behavior, rather than forcing a value that might be wrong.
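To make the nil check concrete, here's a minimal sketch of the pattern. It's an illustration, not DataPorter's actual code: `Config`, `visible_imports`, and the sample hashes are hypothetical, and it assumes `scope` is a callable that narrows a collection, which this article doesn't specify.

```ruby
# Hypothetical sketch: honour an opt-in setting only when it has been set.
# Assumption: `scope` is a callable that narrows a collection of imports.
Config = Struct.new(:scope)

def visible_imports(imports, config)
  # nil means "feature not configured": fall back to showing everything.
  return imports if config.scope.nil?

  config.scope.call(imports)
end

imports = [
  { id: 1, hotel_id: 10 },
  { id: 2, hotel_id: 20 }
]

hotel_ten = ->(all) { all.select { |import| import[:hotel_id] == 10 } }

unscoped = visible_imports(imports, Config.new(nil))       # no scope configured
scoped   = visible_imports(imports, Config.new(hotel_ten)) # scope configured
```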

Let's walk through the attributes that deserve a closer look.

**`parent_controller`** is a string, not a class. That's deliberate. At configuration time (during boot), the host app's controller class might not be loaded yet. We store the string and `constantize` it later, when Rails actually needs to resolve the inheritance chain.
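Here's a small plain-Ruby sketch of why the string helps. `Settings`, `AdminBaseController`, and `ImportsController` are hypothetical names, and `Object.const_get` stands in for ActiveSupport's `constantize` so the snippet runs without Rails.

```ruby
# Sketch: store a class name as a String and resolve it only when needed.
# Object.const_get is a plain-Ruby stand-in for String#constantize.
class Settings
  attr_accessor :parent_controller

  def initialize
    @parent_controller = "ApplicationController"
  end
end

settings = Settings.new
# The class below does not exist yet, and that's fine: it's only a String.
settings.parent_controller = "AdminBaseController"

# ... later, once the host app's code has been loaded ...
class AdminBaseController; end

# Resolution happens here, at the point of use.
parent = Object.const_get(settings.parent_controller)

class ImportsController < parent; end
```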

**`context_builder`** is the most interesting one. It's a lambda that receives the current controller instance and returns whatever the host app needs during import. This is how a multi-tenant app passes `current_hotel` into the import flow without the gem knowing anything about hotels. We'll use this extensively when we build the Orchestrator in part 7.
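As a quick illustration of that contract, the sketch below calls a builder the way the gem might. `FakeController` is a hypothetical stand-in for a real controller, and the sample values are made up; how the engine actually consumes the result is covered later in the series.

```ruby
# Sketch: the gem calls the host's lambda without knowing what it returns.
# FakeController is a hypothetical stand-in for a real Rails controller.
FakeController = Struct.new(:current_hotel, :current_user)

context_builder = ->(controller) {
  { hotel: controller.current_hotel, user: controller.current_user }
}

controller = FakeController.new("Grand Hotel", "alice")

# The gem only needs to know that the builder responds to #call.
context = context_builder.call(controller)
```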

**`enabled_sources`** lets the host app restrict which source types appear in the UI. If you only deal with CSV files, you can set `enabled_sources = %i[csv]` and the JSON/API options won't clutter the interface.
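One way such filtering could work, as a sketch only: `ALL_SOURCES`, its labels, and `source_options` are assumptions for illustration, not the gem's real internals.

```ruby
# Hypothetical sketch of filtering UI options by enabled_sources.
# ALL_SOURCES and its labels are made up for this example.
ALL_SOURCES = { csv: "CSV file", json: "JSON file", api: "API endpoint" }.freeze

def source_options(enabled_sources)
  # Hash#slice keeps only the listed keys, in the order they are given.
  ALL_SOURCES.slice(*enabled_sources)
end

default_options  = source_options(%i[csv json api])
csv_only_options = source_options(%i[csv])
```

A view would then iterate over the returned hash instead of the full list, so disabled sources never reach the page.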

### Step 2 -- The module-level DSL

The Configuration class is just a data object. We need two module-level methods to turn it into a DSL: one to access the singleton instance, and one to yield it for configuration.

```ruby
# lib/data_porter.rb
module DataPorter
  class Error < StandardError; end

  # Returns the shared Configuration, creating it with defaults on first use.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yields the shared Configuration so the host app can override defaults.
  def self.configure
    yield(configuration)
  end
end
```

`configuration` uses memoization via `||=` so that the whole process shares a single instance. The first call creates a `Configuration.new` with all defaults; subsequent calls return the same object. This is the singleton pattern without the ceremony of the `Singleton` module.

`configure` yields that singleton to a block. Since `attr_accessor` creates both reader and writer methods, the block can set any attribute directly. After the block runs, `DataPorter.configuration.queue_name` returns whatever the host app set -- or the default if they didn't touch it.
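To see both behaviours side by side, here's a trimmed-down copy of the DSL with only two attributes. The cut-down class exists just for this demo; the real one has the full attribute list from Step 1.

```ruby
# Trimmed-down copy of the DSL (two attributes only) to show the behaviour.
module DataPorter
  class Configuration
    attr_accessor :queue_name, :preview_limit

    def initialize
      @queue_name = :imports
      @preview_limit = 500
    end
  end

  def self.configuration
    @configuration ||= Configuration.new
  end

  def self.configure
    yield(configuration)
  end
end

DataPorter.configure { |config| config.queue_name = :low_priority }

queue = DataPorter.configuration.queue_name    # overridden by the block
limit = DataPorter.configuration.preview_limit # untouched, keeps its default
```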

There's no `reset!` method in production code. We don't need one. The configuration is set once during Rails boot and stays put for the process lifetime. We do need to reset between tests, but that's handled with `instance_variable_set` in the spec (we'll see that in a moment).

### Step 3 -- Wiring it up on boot

The configuration file is required early, before the engine file. This ensures `DataPorter.configure` is available when the host app's initializer runs.

```ruby
# lib/data_porter.rb (top of file)
require "rails/engine"
require_relative "data_porter/version"
require_relative "data_porter/configuration"
require_relative "data_porter/engine"
```

The load order matters. `configuration.rb` comes before `engine.rb` because the Engine class might reference configuration values during setup. In practice, Rails processes initializers after the engine is loaded, so the host app's `configure` block runs with the full gem already available.

Other parts of the gem read configuration like this:

```ruby
# Inside any DataPorter class
DataPorter.configuration.queue_name
DataPorter.configuration.context_builder&.call(controller)
```

The safe navigation operator (`&.`) on `context_builder` handles the `nil` default gracefully. When no builder is configured, the call simply returns `nil` instead of raising a `NoMethodError`.
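A short experiment in plain Ruby shows the difference. The symbol `:controller` is just a placeholder argument and the variable names are illustrative:

```ruby
# With no builder configured, &. short-circuits and #call is never sent.
builder = nil
no_context = builder&.call(:controller)

# With a builder configured, &. behaves like a normal method call.
builder = ->(controller) { { source: controller } }
context = builder&.call(:controller)
```

Note that `&.` only guards against `nil`. A non-callable value such as a string would still raise, so the operator covers the "not configured" case and nothing more.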

## Decisions & tradeoffs

| Decision | We chose | Over | Because |
|----------|----------|------|---------|
| Configuration object | Plain class with `attr_accessor` | `OpenStruct`, `Dry::Configurable` | No dependencies, easy to read, easy to document; IDE autocompletion works with real attributes |
| Singleton pattern | Memoized module instance variable | `Singleton` module, `Rails.application.config` | Simpler API (`DataPorter.configure`), no coupling to Rails config namespace, works in non-Rails test contexts |
| `context_builder` type | Lambda | Block stored as proc, method object | Lambdas enforce arity (catches wrong argument count), and the syntax `->() {}` signals "this is a callable" to the reader |
| `parent_controller` type | String | Class constant | Avoids load-order issues; the class may not exist at configuration time, but the string can be `constantize`d later |
| Default for optional features | `nil` | Null object pattern | Simpler to check `if context_builder` than to create a no-op null object; the gem has few enough optional features to keep the nil checks manageable |
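
The arity point in the table is easy to verify in plain Ruby. This sketch compares a lambda with a proc when the caller forgets the argument; the variable names are illustrative:

```ruby
strict  = ->(controller) { { user: controller } }
lenient = proc { |controller| { user: controller } }

# A proc quietly fills the missing argument with nil.
lenient_result = lenient.call

# A lambda checks the argument count and raises ArgumentError.
strict_error =
  begin
    strict.call
  rescue ArgumentError => e
    e
  end
```

With a proc, the mistake would surface later as a `nil` controller somewhere inside the import. With a lambda, it fails at the call site.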

## Testing it

The specs verify two things: that defaults are sane, and that the `configure` block actually mutates the singleton.

```ruby
# spec/data_porter/configuration_spec.rb
RSpec.describe DataPorter::Configuration do
  subject(:config) { described_class.new }

  it "has default parent_controller" do
    expect(config.parent_controller).to eq("ApplicationController")
  end

  it "has default queue_name" do
    expect(config.queue_name).to eq(:imports)
  end

  it "has default preview_limit" do
    expect(config.preview_limit).to eq(500)
  end

  it "has nil context_builder by default" do
    expect(config.context_builder).to be_nil
  end
end
```

Notice that each default gets its own test. This is intentional. When someone changes a default six months from now, the failure message says exactly which default broke, not just "configuration test failed."

The module-level specs test the singleton behavior and the `configure` yield pattern:

```ruby
# spec/data_porter/configuration_spec.rb
RSpec.describe DataPorter do
  describe ".configure" do
    after { DataPorter.instance_variable_set(:@configuration, nil) }

    it "yields the configuration" do
      DataPorter.configure do |config|
        config.queue_name = :custom_queue
      end

      expect(DataPorter.configuration.queue_name).to eq(:custom_queue)
    end
  end

  describe ".configuration" do
    after { DataPorter.instance_variable_set(:@configuration, nil) }

    it "memoizes the configuration" do
      expect(DataPorter.configuration).to be(DataPorter.configuration)
    end
  end
end
```

The `after` block resets the singleton between tests using `instance_variable_set`. This is the one place where we reach into internals, and it's acceptable because test isolation trumps encapsulation here. A public `reset!` method would leak test concerns into production code.

## Recap

- The `Configuration` class is a plain Ruby object with `attr_accessor` and defaults in `initialize`. No framework magic, no dependencies.
- Two module-level methods (`configure` and `configuration`) create the DSL that host apps use in their initializer.
- Sensible defaults mean the gem works with zero configuration. `context_builder` and `scope` are opt-in via nil defaults.
- Storing `parent_controller` as a string avoids boot-order issues. Using a lambda for `context_builder` enforces arity and reads clearly.

## Next up

Configuration tells the gem *how* to behave. In part 4, we'll tackle *what* it operates on: the data models. We'll use StoreModel and JSONB columns to store import records, validation errors, and summary reports as structured data inside a single table -- no migration per import type, no schema sprawl. If you've ever debated "extra table vs. JSON column," that's the one to read.

---

*This is part 3 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Scaffolding a Rails Engine gem](#) | [Next: Modeling import data with StoreModel & JSONB](#)*