data_porter 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (159)
  1. checksums.yaml +7 -0
  2. data/.claude/commands/blog-status.md +10 -0
  3. data/.claude/commands/blog.md +109 -0
  4. data/.claude/commands/task-done.md +27 -0
  5. data/.claude/commands/tm/add-dependency.md +58 -0
  6. data/.claude/commands/tm/add-subtask.md +79 -0
  7. data/.claude/commands/tm/add-task.md +81 -0
  8. data/.claude/commands/tm/analyze-complexity.md +124 -0
  9. data/.claude/commands/tm/analyze-project.md +100 -0
  10. data/.claude/commands/tm/auto-implement-tasks.md +100 -0
  11. data/.claude/commands/tm/command-pipeline.md +80 -0
  12. data/.claude/commands/tm/complexity-report.md +120 -0
  13. data/.claude/commands/tm/convert-task-to-subtask.md +74 -0
  14. data/.claude/commands/tm/expand-all-tasks.md +52 -0
  15. data/.claude/commands/tm/expand-task.md +52 -0
  16. data/.claude/commands/tm/fix-dependencies.md +82 -0
  17. data/.claude/commands/tm/help.md +101 -0
  18. data/.claude/commands/tm/init-project-quick.md +49 -0
  19. data/.claude/commands/tm/init-project.md +53 -0
  20. data/.claude/commands/tm/install-taskmaster.md +118 -0
  21. data/.claude/commands/tm/learn.md +106 -0
  22. data/.claude/commands/tm/list-tasks-by-status.md +42 -0
  23. data/.claude/commands/tm/list-tasks-with-subtasks.md +30 -0
  24. data/.claude/commands/tm/list-tasks.md +46 -0
  25. data/.claude/commands/tm/next-task.md +69 -0
  26. data/.claude/commands/tm/parse-prd-with-research.md +51 -0
  27. data/.claude/commands/tm/parse-prd.md +52 -0
  28. data/.claude/commands/tm/project-status.md +67 -0
  29. data/.claude/commands/tm/quick-install-taskmaster.md +23 -0
  30. data/.claude/commands/tm/remove-all-subtasks.md +94 -0
  31. data/.claude/commands/tm/remove-dependency.md +65 -0
  32. data/.claude/commands/tm/remove-subtask.md +87 -0
  33. data/.claude/commands/tm/remove-subtasks.md +89 -0
  34. data/.claude/commands/tm/remove-task.md +110 -0
  35. data/.claude/commands/tm/setup-models.md +52 -0
  36. data/.claude/commands/tm/show-task.md +85 -0
  37. data/.claude/commands/tm/smart-workflow.md +58 -0
  38. data/.claude/commands/tm/sync-readme.md +120 -0
  39. data/.claude/commands/tm/tm-main.md +147 -0
  40. data/.claude/commands/tm/to-cancelled.md +58 -0
  41. data/.claude/commands/tm/to-deferred.md +50 -0
  42. data/.claude/commands/tm/to-done.md +47 -0
  43. data/.claude/commands/tm/to-in-progress.md +39 -0
  44. data/.claude/commands/tm/to-pending.md +35 -0
  45. data/.claude/commands/tm/to-review.md +43 -0
  46. data/.claude/commands/tm/update-single-task.md +122 -0
  47. data/.claude/commands/tm/update-task.md +75 -0
  48. data/.claude/commands/tm/update-tasks-from-id.md +111 -0
  49. data/.claude/commands/tm/validate-dependencies.md +72 -0
  50. data/.claude/commands/tm/view-models.md +52 -0
  51. data/.env.example +12 -0
  52. data/.mcp.json +24 -0
  53. data/.taskmaster/CLAUDE.md +435 -0
  54. data/.taskmaster/config.json +44 -0
  55. data/.taskmaster/docs/prd.txt +2044 -0
  56. data/.taskmaster/state.json +6 -0
  57. data/.taskmaster/tasks/task_001.md +19 -0
  58. data/.taskmaster/tasks/task_002.md +19 -0
  59. data/.taskmaster/tasks/task_003.md +19 -0
  60. data/.taskmaster/tasks/task_004.md +19 -0
  61. data/.taskmaster/tasks/task_005.md +19 -0
  62. data/.taskmaster/tasks/task_006.md +19 -0
  63. data/.taskmaster/tasks/task_007.md +19 -0
  64. data/.taskmaster/tasks/task_008.md +19 -0
  65. data/.taskmaster/tasks/task_009.md +19 -0
  66. data/.taskmaster/tasks/task_010.md +19 -0
  67. data/.taskmaster/tasks/task_011.md +19 -0
  68. data/.taskmaster/tasks/task_012.md +19 -0
  69. data/.taskmaster/tasks/task_013.md +19 -0
  70. data/.taskmaster/tasks/task_014.md +19 -0
  71. data/.taskmaster/tasks/task_015.md +19 -0
  72. data/.taskmaster/tasks/task_016.md +19 -0
  73. data/.taskmaster/tasks/task_017.md +19 -0
  74. data/.taskmaster/tasks/task_018.md +19 -0
  75. data/.taskmaster/tasks/task_019.md +19 -0
  76. data/.taskmaster/tasks/task_020.md +19 -0
  77. data/.taskmaster/tasks/tasks.json +299 -0
  78. data/.taskmaster/templates/example_prd.txt +47 -0
  79. data/.taskmaster/templates/example_prd_rpg.txt +511 -0
  80. data/CHANGELOG.md +29 -0
  81. data/CLAUDE.md +65 -0
  82. data/CODE_OF_CONDUCT.md +10 -0
  83. data/CONTRIBUTING.md +49 -0
  84. data/LICENSE +21 -0
  85. data/README.md +463 -0
  86. data/Rakefile +12 -0
  87. data/app/assets/stylesheets/data_porter/application.css +646 -0
  88. data/app/channels/data_porter/import_channel.rb +10 -0
  89. data/app/controllers/data_porter/imports_controller.rb +68 -0
  90. data/app/javascript/data_porter/progress_controller.js +33 -0
  91. data/app/jobs/data_porter/dry_run_job.rb +12 -0
  92. data/app/jobs/data_porter/import_job.rb +12 -0
  93. data/app/jobs/data_porter/parse_job.rb +12 -0
  94. data/app/models/data_porter/data_import.rb +49 -0
  95. data/app/views/data_porter/imports/index.html.erb +142 -0
  96. data/app/views/data_porter/imports/new.html.erb +88 -0
  97. data/app/views/data_porter/imports/show.html.erb +49 -0
  98. data/config/database.yml +3 -0
  99. data/config/routes.rb +12 -0
  100. data/docs/SPEC.md +2012 -0
  101. data/docs/UI.md +32 -0
  102. data/docs/blog/001-why-build-a-data-import-engine.md +166 -0
  103. data/docs/blog/002-scaffolding-a-rails-engine.md +188 -0
  104. data/docs/blog/003-configuration-dsl.md +222 -0
  105. data/docs/blog/004-store-model-jsonb.md +237 -0
  106. data/docs/blog/005-target-dsl.md +284 -0
  107. data/docs/blog/006-parsing-csv-sources.md +300 -0
  108. data/docs/blog/007-orchestrator.md +247 -0
  109. data/docs/blog/008-actioncable-stimulus.md +376 -0
  110. data/docs/blog/009-phlex-ui-components.md +446 -0
  111. data/docs/blog/010-controllers-routing.md +374 -0
  112. data/docs/blog/011-generators.md +364 -0
  113. data/docs/blog/012-json-api-sources.md +323 -0
  114. data/docs/blog/013-testing-rails-engine.md +618 -0
  115. data/docs/blog/014-dry-run.md +307 -0
  116. data/docs/blog/015-publishing-retro.md +264 -0
  117. data/docs/blog/016-erb-view-templates.md +431 -0
  118. data/docs/blog/017-showcase-final-retro.md +220 -0
  119. data/docs/blog/BACKLOG.md +8 -0
  120. data/docs/blog/SERIES.md +154 -0
  121. data/docs/screenshots/index-with-previewing.jpg +0 -0
  122. data/docs/screenshots/index.jpg +0 -0
  123. data/docs/screenshots/modal-new-import.jpg +0 -0
  124. data/docs/screenshots/preview.jpg +0 -0
  125. data/lib/data_porter/broadcaster.rb +29 -0
  126. data/lib/data_porter/components/base.rb +10 -0
  127. data/lib/data_porter/components/failure_alert.rb +20 -0
  128. data/lib/data_porter/components/preview_table.rb +54 -0
  129. data/lib/data_porter/components/progress_bar.rb +33 -0
  130. data/lib/data_porter/components/results_summary.rb +19 -0
  131. data/lib/data_porter/components/status_badge.rb +16 -0
  132. data/lib/data_porter/components/summary_cards.rb +30 -0
  133. data/lib/data_porter/components.rb +14 -0
  134. data/lib/data_porter/configuration.rb +25 -0
  135. data/lib/data_porter/dsl/api_config.rb +25 -0
  136. data/lib/data_porter/dsl/column.rb +17 -0
  137. data/lib/data_porter/engine.rb +15 -0
  138. data/lib/data_porter/orchestrator.rb +141 -0
  139. data/lib/data_porter/record_validator.rb +32 -0
  140. data/lib/data_porter/registry.rb +33 -0
  141. data/lib/data_porter/sources/api.rb +49 -0
  142. data/lib/data_porter/sources/base.rb +35 -0
  143. data/lib/data_porter/sources/csv.rb +43 -0
  144. data/lib/data_porter/sources/json.rb +45 -0
  145. data/lib/data_porter/sources.rb +20 -0
  146. data/lib/data_porter/store_models/error.rb +13 -0
  147. data/lib/data_porter/store_models/import_record.rb +52 -0
  148. data/lib/data_porter/store_models/report.rb +21 -0
  149. data/lib/data_porter/target.rb +89 -0
  150. data/lib/data_porter/type_validator.rb +46 -0
  151. data/lib/data_porter/version.rb +5 -0
  152. data/lib/data_porter.rb +32 -0
  153. data/lib/generators/data_porter/install/install_generator.rb +33 -0
  154. data/lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb +21 -0
  155. data/lib/generators/data_porter/install/templates/initializer.rb +30 -0
  156. data/lib/generators/data_porter/target/target_generator.rb +44 -0
  157. data/lib/generators/data_porter/target/templates/target.rb.tt +20 -0
  158. data/sig/data_porter.rbs +4 -0
  159. metadata +274 -0
data/docs/UI.md ADDED
@@ -0,0 +1,32 @@
1
+ #### UI UX
2
+
3
+ - Phlex
4
+ - Tailwind
5
+
6
+ ```js
7
+ // tailwind.config.js (in the gem, at build time)
8
+ module.exports = {
9
+ prefix: 'dp-',
10
+ important: '.data-porter',
11
+ content: [
12
+ './app/views/data_porter/**/*.erb',
13
+ './lib/data_porter/components/**/*.rb',
14
+ './app/javascript/data_porter/**/*.js'
15
+ ],
16
+ corePlugins: {
17
+ preflight: false // no global reset; leave the host app's styles untouched
18
+ },
19
+ theme: {
20
+ extend: {
21
+ colors: {
22
+ complete: 'var(--dp-color-complete, #16a34a)',
23
+ partial: 'var(--dp-color-partial, #ca8a04)',
24
+ missing: 'var(--dp-color-missing, #dc2626)',
25
+ primary: 'var(--dp-color-primary, #6366f1)',
26
+ }
27
+ }
28
+ }
29
+ }
30
+ ```
31
+
32
+ ---
data/docs/blog/001-why-build-a-data-import-engine.md ADDED
@@ -0,0 +1,166 @@
1
+ ---
2
+ title: "Building DataPorter #1 — Why build a data import engine?"
3
+ series: "Building DataPorter - A Data Import Engine for Rails"
4
+ part: 1
5
+ tags: [ruby, rails, rails-engine, gem-development, architecture, open-source]
6
+ published: false
7
+ ---
8
+
9
+ # Why build a data import engine?
10
+
11
+ > Every Rails app eventually needs to import data. Let's stop rewriting the same workflow every time.
12
+
13
+ ## Context
14
+
15
+ This is the first article in a series where we build **DataPorter**, a mountable Rails engine for data import workflows, from scratch. We'll go from `bundle gem` to a published rubygem, covering architecture decisions, DSL design, testing strategies, and everything in between.
16
+
17
+ By the end of this series, you'll have a deep understanding of how to build a production-ready Rails engine — and a reusable gem to show for it.
18
+
19
+ ## The problem
20
+
21
+ If you've worked on any non-trivial Rails application, you've probably written this code more than once:
22
+
23
+ 1. Upload a CSV (or fetch data from an API)
24
+ 2. Parse and validate each row
25
+ 3. Show the user what's about to be imported
26
+ 4. Persist the valid records to the database
27
+
28
+ Maybe it was a guest list for a hotel app. Maybe vendor data for an e-commerce platform. Maybe scraped listings from an external API.
29
+
30
+ The specifics change, but the workflow is always the same. And every time, we rebuild it from scratch: a controller action here, some CSV parsing there, a background job, maybe a progress bar if we're feeling fancy.
31
+
32
+ The result? Scattered import logic across controllers, services, and jobs. No consistency. No reuse. Every new import type means rewriting the same infrastructure.
33
+
34
+ ## What we're building
35
+
36
+ DataPorter is a mountable Rails engine that provides the entire import infrastructure. The host app only defines the business part: *what* to import and *how* to persist it.
37
+
38
+ One file, one class, one import type:
39
+
40
+ ```ruby
41
+ # app/importers/guests_target.rb
42
+ class GuestsTarget < DataPorter::Target
43
+ label "Guests"
44
+ model Guest
45
+ sources :csv, :json
46
+
47
+ columns do
48
+ column :first_name, type: :string, required: true
49
+ column :last_name, type: :string, required: true
50
+ column :email, type: :email
51
+ column :phone, type: :phone
52
+ end
53
+
54
+ def persist(record, context:)
55
+ Guest.create!(hotel: context.hotel, **record.attributes)
56
+ end
57
+ end
58
+ ```
59
+
60
+ That's it. DataPorter handles the rest: file upload, parsing, validation, preview UI, progress tracking, error reporting, and background processing.
61
+
62
+ The workflow is always three steps:
63
+
64
+ ```
65
+ Upload / Configure → Preview & Validate → Import
66
+ ```
67
+
68
+ ## Why not use what already exists?
69
+
70
+ There are existing solutions in the Rails ecosystem. Let's look at the two most common approaches.
71
+
72
+ ### The DIY approach
73
+
74
+ Most teams build custom import flows per model. It works, but it doesn't scale. By the third import type, you're copy-pasting controller actions and wishing you had abstracted earlier.
75
+
76
+ ### maintenance_tasks
77
+
78
+ Shopify's [maintenance_tasks](https://github.com/Shopify/maintenance_tasks) gem is excellent for one-off data processing scripts. It provides a UI, background processing, and CSV support.
79
+
80
+ But it solves a different problem. It's designed for fire-and-forget maintenance operations, not interactive import workflows.
81
+
82
+ | Aspect | maintenance_tasks | DataPorter |
83
+ |--------|-------------------|------------|
84
+ | Purpose | One-off scripts | Import workflows |
85
+ | Preview before import | No | Yes |
86
+ | Visual validation | No | Yes (complete/partial/missing) |
87
+ | Multi-step workflow | No (fire & forget) | Yes (parse -> preview -> import) |
88
+ | Real-time progress | No | Yes (ActionCable) |
89
+ | Data sources | CSV, ActiveRecord | CSV, JSON, API (extensible) |
90
+ | Auto-generated UI | Parameter form | Dynamic column table |
91
+
92
+ The key difference: DataPorter adds a **human validation step** between parsing and persisting. The user sees exactly what will be imported, with clear status indicators for each row, before anything touches the database.
93
+
94
+ ## Architecture overview
95
+
96
+ DataPorter is split into two clear layers:
97
+
98
+ ```
99
+ ┌─────────────────────────────────────┐
100
+ │ DataPorter (the gem) │
101
+ │ │
102
+ │ Engine, Model, State Machine, │
103
+ │ Sources, Orchestrator, Jobs, │
104
+ │ ActionCable, UI, DSL, Registry, │
105
+ │ Generators │
106
+ └──────────────┬──────────────────────┘
107
+ │ mount + configure + define targets
108
+ ┌──────────────┴──────────────────────┐
109
+ │ Host App │
110
+ │ │
111
+ │ Initializer, Target files, │
112
+ │ Auth (parent controller), │
113
+ │ Style overrides (optional) │
114
+ └─────────────────────────────────────┘
115
+ ```
116
+
117
+ The gem owns the infrastructure. The host app owns the business logic. This separation is the core design principle we'll follow throughout the series.
118
+
119
+ ## The tech stack
120
+
121
+ Here's what we'll use and why:
122
+
123
+ | Dependency | Role | Why |
124
+ |------------|------|-----|
125
+ | **store_model** | Typed JSONB attributes | Store import records as structured data without extra tables |
126
+ | **phlex** | View components | Ruby-native views, easier to test and namespace than ERB |
127
+ | **turbo-rails** | Page updates | Turbo Frames for partial reloads during the import flow |
128
+ | **stimulus** | JS behavior | Progress bar updates via ActionCable |
129
+ | **Tailwind CSS** | Styling | Scoped with `dp-` prefix to avoid host app conflicts |
130
+
131
+ We'll also rely heavily on Rails built-ins: ActiveJob for background processing, ActionCable for real-time updates, ActiveStorage for file uploads, and an enum-based state machine for the import lifecycle.
132
+
133
+ ## What this series covers
134
+
135
+ Here's the roadmap — each part is a standalone article:
136
+
137
+ 1. **Why build a data import engine?** (this article)
138
+ 2. **Scaffolding a Rails Engine gem** — gem structure, Engine setup
139
+ 3. **Configuration DSL** — making the gem flexible
140
+ 4. **StoreModel & JSONB** — modeling import data without extra tables
141
+ 5. **Target DSL** — one file = one import type
142
+ 6. **CSV parsing with Sources** — the first end-to-end flow
143
+ 7. **The Orchestrator** — coordinating parse and import
144
+ 8. **ActionCable & Stimulus** — real-time progress
145
+ 9. **Phlex & Tailwind UI** — auto-generated preview tables
146
+ 10. **Controllers & routing** — engine controllers done right
147
+ 11. **Generators** — install in one command
148
+ 12. **JSON & API sources** — beyond CSV
149
+ 13. **Testing a Rails Engine** — specs for an isolated engine
150
+ 14. **Dry Run mode** — validate against the database before importing
151
+ 15. **Publishing & retrospective** — from repo to rubygems.org
152
+
153
+ ## Recap
154
+
155
+ - Data import is a recurring pattern in Rails apps that deserves a reusable solution
156
+ - DataPorter provides the infrastructure (upload, parse, preview, import) while the host app defines the business logic
157
+ - The 3-step workflow with human validation is what sets it apart from existing tools
158
+ - We're building a proper mountable Rails engine with a clean separation between gem and host app
159
+
160
+ ## Next up
161
+
162
+ In the next article, we'll run `bundle gem data_porter`, set up the Rails Engine with `isolate_namespace`, structure our directories, and configure the gemspec with our dependencies. We'll make our first architectural decisions — and explain why they matter.
163
+
164
+ ---
165
+
166
+ *This is part 1 of the series "Building DataPorter - A Data Import Engine for Rails". [Next: Scaffolding a Rails Engine gem](#)*
data/docs/blog/002-scaffolding-a-rails-engine.md ADDED
@@ -0,0 +1,188 @@
1
+ ---
2
+ title: "Building DataPorter #2 — Scaffolding a Rails Engine gem"
3
+ series: "Building DataPorter - A Data Import Engine for Rails"
4
+ part: 2
5
+ tags: [ruby, rails, rails-engine, gem-development, architecture]
6
+ published: false
7
+ ---
8
+
9
+ # Scaffolding a Rails Engine gem
10
+
11
+ > Running `bundle gem` is just the beginning. Here's how to turn a blank gem into a proper Rails Engine with isolated namespacing, auto-loading, and a clean dependency story.
12
+
13
+ ## Context
14
+
15
+ In [Part 1](/docs/blog/001-why-build-a-data-import-engine.md), we defined the problem DataPorter solves: a reusable, multi-step import workflow for Rails apps. We talked about architecture, the tech stack, and what sets this gem apart from existing solutions.
16
+
17
+ Now it's time to write code. In this article, we'll scaffold the gem, wire up the Rails Engine, and make our first real decisions about directory structure, namespacing, and dependencies.
18
+
19
+ ## The problem
20
+
21
+ `bundle gem data_porter` gives you a perfectly valid Ruby gem. But it gives you a Ruby gem, not a Rails Engine. There's no `config/routes.rb`, no autoloading of models or controllers, no way for a host app to mount your gem at a path. A raw gem doesn't know anything about Rails.
22
+
23
+ Turning that skeleton into a mountable engine means answering a few questions upfront: How do we isolate our namespace so we don't collide with the host app? How do we structure directories so Rails autoloading works inside the gem? Which dependencies do we declare, and how tightly do we pin them?
24
+
25
+ Get these wrong and you'll be fighting Rails conventions for the rest of the project. Get them right and everything else just works.
26
+
27
+ ## What we're building
28
+
29
+ By the end of this article, the gem will have a working Engine, an isolated namespace, auto-discovery of host app import targets, and a clean gemspec. Here's the directory tree we're aiming for:
30
+
31
+ ```
32
+ data_porter/
33
+ config/
34
+ routes.rb # Engine routes (empty for now)
35
+ lib/
36
+ data_porter.rb # Entry point: requires + module-level API
37
+ data_porter/
38
+ version.rb # Version constant
39
+ engine.rb # The Rails Engine
40
+ configuration.rb # Config object (teased here, detailed in Part 3)
41
+ spec/
42
+ data_porter/
43
+ engine_spec.rb # Engine specs
44
+ data_porter.gemspec # Dependencies and metadata
45
+ ```
46
+
47
+ This is deliberately minimal. We'll add `app/models`, `app/controllers`, and `app/views` directories as we build those layers in later articles. Rails Engine autoloading picks them up automatically once they exist, so there's no reason to create empty folders now.
48
+
49
+ ## Implementation
50
+
51
+ ### Step 1 -- The Engine class
52
+
53
+ The Engine is the single most important file in a Rails Engine gem. It's the bridge between your gem and the host application's Rails stack.
54
+
55
+ ```ruby
56
+ # lib/data_porter/engine.rb
57
+ module DataPorter
58
+ class Engine < ::Rails::Engine
59
+ isolate_namespace DataPorter
60
+
61
+ config.to_prepare do
62
+ Dir[Rails.root.join("app/importers/*_target.rb")].each { |f| require f }
63
+ end
64
+ end
65
+ end
66
+ ```
67
+
68
+ Three things happen in these few lines, and each one matters.
69
+
70
+ First, we inherit from `::Rails::Engine`. The leading `::` ensures we reference the top-level `Rails` module, not anything that might be nested under `DataPorter`. This is the line that tells Rails "this gem provides engine functionality" -- routes, autoloading, migrations, the works.
71
+
72
+ Second, `isolate_namespace DataPorter` is the key architectural decision. It tells Rails to scope everything the engine provides -- routes, models, controllers -- under the `DataPorter` module. Without it, a `DataImport` model in the engine would collide with a `DataImport` model in the host app. Isolation means our table names become `data_porter_data_imports`, our routes get prefixed, and the host app stays clean.
73
+
74
+ Third, the `config.to_prepare` block handles auto-discovery. We want host apps to define import targets in `app/importers/` using a simple naming convention: `*_target.rb`. The `to_prepare` callback runs once at boot in production and again on every code reload in development, so the glob is re-evaluated without a restart. This is the standard Rails pattern for loading code that needs to exist before the app serves requests.
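To see what the `*_target.rb` convention actually selects, here is a minimal plain-Ruby sketch run against a temporary directory rather than a real host app. The file names are hypothetical, and `File.join` replaces `Rails.root.join` so the snippet runs without Rails.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical file names standing in for a host app's app/importers/ folder.
matched = Dir.mktmpdir do |root|
  importers = File.join(root, "app", "importers")
  FileUtils.mkdir_p(importers)

  %w[guests_target.rb vendors_target.rb csv_helper.rb].each do |name|
    File.write(File.join(importers, name), "# placeholder\n")
  end

  # Same glob shape as the engine's to_prepare block.
  Dir[File.join(importers, "*_target.rb")].map { |f| File.basename(f) }.sort
end

puts matched.inspect
```

Only the two files ending in `_target.rb` are matched. Helpers that live in the same folder are ignored by the glob, so they would need to be loaded some other way.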
75
+
76
+ ### Step 2 -- The entry point
77
+
78
+ Every gem needs a single file that bootstraps everything. Ours is `lib/data_porter.rb`, and it sets the order of operations.
79
+
80
+ ```ruby
81
+ # lib/data_porter.rb
82
+ require "rails/engine"
83
+ require_relative "data_porter/version"
84
+ require_relative "data_porter/configuration"
85
+ require_relative "data_porter/engine"
86
+
87
+ module DataPorter
88
+ class Error < StandardError; end
89
+
90
+ def self.configuration
91
+ @configuration ||= Configuration.new
92
+ end
93
+
94
+ def self.configure
95
+ yield(configuration)
96
+ end
97
+ end
98
+ ```
99
+
100
+ The require order is intentional. We load `rails/engine` first because it defines `Rails::Engine`, the class `DataPorter::Engine` inherits from. Then the version (needed by the gemspec), then configuration (needed by the module-level API), then the engine itself.
101
+
102
+ The `DataPorter` module also defines the gem's public API surface. `DataPorter.configure` with a block is the pattern host apps will use to customize behavior. `DataPorter::Error` gives us a base exception class to inherit from in later parts. We'll dive deep into the configuration DSL in Part 3 -- for now, the important thing is that the scaffolding is in place.
103
+
104
+ Notice that we use `require_relative` for our own files and `require` for external dependencies. This is a minor but useful convention: `require_relative` is resolved from the current file's location, so it works regardless of how the gem is loaded. `require` goes through the Ruby load path, which is correct for third-party gems.
105
+
106
+ ### Step 3 -- The gemspec
107
+
108
+ The gemspec declares what the gem needs to run and how it should be packaged. Let's look at the dependency section, which is where the real decisions live.
109
+
110
+ ```ruby
111
+ # data_porter.gemspec
112
+ Gem::Specification.new do |spec|
113
+ spec.name = "data_porter"
114
+ spec.version = DataPorter::VERSION
115
+ spec.required_ruby_version = ">= 3.2.0"
116
+
117
+ # ...metadata...
118
+
119
+ spec.add_dependency "rails", ">= 7.0"
120
+ spec.add_dependency "store_model", ">= 2.0"
121
+ spec.add_dependency "phlex", ">= 1.0"
122
+ spec.add_dependency "turbo-rails", ">= 1.0"
123
+ end
124
+ ```
125
+
126
+ Four runtime dependencies, each chosen for a specific reason.
127
+
128
+ **rails >= 7.0** is the floor. We use features like `enum` improvements and `config.to_prepare` patterns that are stable from Rails 7 onward. No upper bound -- we trust semver and test against multiple Rails versions in CI.
129
+
130
+ **store_model >= 2.0** lets us define typed, validatable Ruby objects backed by JSONB columns. We'll use it to model import records, errors, and reports without creating extra database tables. Part 4 covers this in detail.
131
+
132
+ **phlex >= 1.0** gives us Ruby-native view components. We chose it over ViewComponent because Phlex components are plain Ruby classes -- easier to namespace, easier to test, no template files to manage inside a gem.
133
+
134
+ **turbo-rails >= 1.0** provides Turbo Frames and Turbo Streams. The import workflow is a natural fit for Turbo: upload in one frame, preview in another, progress updates via streams.
135
+
136
+ All version constraints use `>=` with a floor, not `~>` with a pessimistic lock. This is deliberate for a library gem. A host app should control the exact versions through its `Gemfile.lock`. If we locked `phlex ~> 1.0`, we'd block any host app that wants to use Phlex 2.x. The trade-off is that we need CI coverage across versions, but that's a solvable problem (and we'll set that up in a later article).
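The difference between the two operators can be checked with RubyGems' own `Gem::Requirement` class. This is a small sketch. The version numbers are illustrative and are not taken from the gem's CI matrix.

```ruby
require "rubygems"

pessimistic = Gem::Requirement.new("~> 1.0") # equivalent to >= 1.0, < 2.0
optimistic  = Gem::Requirement.new(">= 1.0") # floor only, no upper bound

phlex_1 = Gem::Version.new("1.9.0") # illustrative 1.x release
phlex_2 = Gem::Version.new("2.0.0") # illustrative 2.x release

results = {
  pessimistic_allows_1x: pessimistic.satisfied_by?(phlex_1),
  pessimistic_allows_2x: pessimistic.satisfied_by?(phlex_2),
  optimistic_allows_2x:  optimistic.satisfied_by?(phlex_2)
}

puts results.inspect
```

The pessimistic requirement accepts the 1.x version and rejects 2.0.0, while the optimistic one accepts both. That rejection is the blocking behavior the paragraph above describes.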
137
+
138
+ Also worth noting: the `files` array in the gemspec uses `git ls-files` and explicitly excludes test files, bin scripts, and CI config. The published gem should only include what's needed to run it.
139
+
140
+ ## Decisions and tradeoffs
141
+
142
+ | Decision | We chose | Over | Because |
143
+ |----------|----------|------|---------|
144
+ | Namespace isolation | `isolate_namespace` | Full engine (no isolation) | Prevents model/route collisions with host app |
145
+ | Version constraints | `>= floor` (optimistic) | `~> x.y` (pessimistic) | Library gems should not lock transitive dependencies tightly |
146
+ | Target auto-discovery | `config.to_prepare` with Dir glob | Explicit registration API | Convention over configuration; zero boilerplate for host apps |
147
+ | View layer | Phlex | ERB / ViewComponent | Pure Ruby classes are easier to namespace and test inside a gem |
148
+ | Require strategy | `require_relative` for internal files | `require` for everything | Resolves from file location, works regardless of load path state |
149
+
150
+ ## Testing it
151
+
152
+ We can verify the engine is wired up correctly with a focused spec.
153
+
154
+ ```ruby
155
+ # spec/data_porter/engine_spec.rb
156
+ RSpec.describe DataPorter::Engine do
157
+ it "is a Rails::Engine" do
158
+ expect(described_class.superclass).to eq(Rails::Engine)
159
+ end
160
+
161
+ it "has an isolated namespace" do
162
+ expect(described_class.isolated?).to be true
163
+ end
164
+
165
+ it "is namespaced under DataPorter" do
166
+ expect(described_class.engine_name).to eq("data_porter")
167
+ end
168
+ end
169
+ ```
170
+
171
+ These three assertions confirm the fundamentals: the class hierarchy is correct, isolation is active, and the engine name Rails derives matches what we expect. The `isolated?` check is particularly important -- if someone accidentally removes the `isolate_namespace` line, this test catches it immediately.
172
+
173
+ ## Recap
174
+
175
+ - A Rails Engine is what turns a Ruby gem into something a Rails app can mount, with routes, autoloading, and migrations.
176
+ - `isolate_namespace` prevents name collisions between the engine and the host app. It's a one-liner, but it shapes the entire architecture.
177
+ - The entry point (`lib/data_porter.rb`) sets up the require order and defines the gem's public API surface.
178
+ - Gemspec dependencies should use optimistic version constraints (`>=`) so host apps stay in control of their dependency tree.
179
+ - The `config.to_prepare` callback gives us zero-config auto-discovery of import targets in the host app.
180
+
181
+ ## Next up
182
+
183
+ We have a working engine, but it's not configurable yet. In Part 3, we'll build the configuration DSL -- the `DataPorter.configure` block that lets host apps customize parent controllers, queue names, storage backends, and more. We'll look at why a dedicated configuration object beats scattering settings across `Rails.application.config`, and how to design defaults that make sense out of the box.
184
+
185
+ ---
186
+
187
+ *This is part 2 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Why build a data import engine?](/docs/blog/001-why-build-a-data-import-engine.md) | [Next: Configuration DSL](#)*
188
+ *Code: [GitHub](https://github.com/SerylLns/data_porter)*
data/docs/blog/003-configuration-dsl.md ADDED
@@ -0,0 +1,222 @@
1
+ ---
2
+ title: "Building DataPorter #3 — Configuration DSL: making the gem flexible"
3
+ series: "Building DataPorter - A Data Import Engine for Rails"
4
+ part: 3
5
+ tags: [ruby, rails, rails-engine, gem-development, dsl, configuration, singleton-pattern]
6
+ published: false
7
+ ---
8
+
9
+ # Configuration DSL: making the gem flexible
10
+
11
+ > A gem that can't adapt to its host app will never leave your own repo.
12
+
13
+ ## Context
14
+
15
+ This is part 3 of the series where we build **DataPorter**, a mountable Rails engine for data import workflows. In [part 1](#), we established the problem and architecture. Part 2 covered scaffolding the engine gem with `isolate_namespace`.
16
+
17
+ In this article, we'll build the configuration layer: a clean DSL that lets host apps customize DataPorter's behavior through an initializer. By the end, you'll have a `DataPorter.configure` block that feels like any well-designed Rails gem.
18
+
19
+ ## The problem
20
+
21
+ Our engine needs to run inside apps we don't control. One app uses Sidekiq with a custom queue name. Another stores files on S3. A third needs every import scoped to the current hotel's account.
22
+
23
+ Hard-coding any of these choices inside the gem would make it useless to anyone whose setup differs from ours. But making *everything* configurable turns the gem into a configuration puzzle where nobody remembers what goes where.
24
+
25
+ The challenge is finding the line between flexibility and convention -- and expressing it through an API that feels obvious on first read.
26
+
27
+ ## What we're building
28
+
29
+ Here's the end result from the host app's perspective:
30
+
31
+ ```ruby
32
+ # config/initializers/data_porter.rb
33
+ DataPorter.configure do |config|
34
+ config.parent_controller = "Admin::BaseController"
35
+ config.queue_name = :low_priority
36
+ config.storage_service = :amazon
37
+ config.preview_limit = 200
38
+
39
+ config.context_builder = ->(controller) {
40
+ { hotel: controller.current_hotel, user: controller.current_user }
41
+ }
42
+ end
43
+ ```
44
+
45
+ This reads exactly like a Devise or Sidekiq initializer. A `configure` block yields a plain object with sensible defaults. If you don't call `configure` at all, everything still works.
46
+
47
+ ## Implementation
48
+
49
+ ### Step 1 -- The Configuration class
50
+
51
+ We need an object that holds every configurable value and provides reasonable defaults out of the box. The simplest approach that works: a plain Ruby class with `attr_accessor` and defaults set in `initialize`.
52
+
53
+ ```ruby
54
+ # lib/data_porter/configuration.rb
55
+ module DataPorter
56
+   class Configuration
57
+     attr_accessor :parent_controller,
58
+                   :queue_name,
59
+                   :storage_service,
60
+                   :cable_channel_prefix,
61
+                   :context_builder,
62
+                   :preview_limit,
63
+                   :enabled_sources,
64
+                   :scope
65
+
66
+     def initialize
67
+       @parent_controller = "ApplicationController"
68
+       @queue_name = :imports
69
+       @storage_service = :local
70
+       @cable_channel_prefix = "data_porter"
71
+       @context_builder = nil
72
+       @preview_limit = 500
73
+       @enabled_sources = %i[csv json api]
74
+       @scope = nil
75
+     end
76
+   end
77
+ end
78
+ ```
79
+
80
+ Every attribute has a default that makes the gem work without any initializer. `parent_controller` defaults to `"ApplicationController"` because every generated Rails app defines it. `storage_service` defaults to `:local` because that requires zero setup. `preview_limit` defaults to 500 rows to keep the preview page responsive.
81
+
82
+ Two attributes default to `nil` on purpose: `context_builder` and `scope`. These are opt-in features. When `context_builder` is nil, imports run without host-specific context. When `scope` is nil, the engine shows all imports. The gem checks for nil and adapts its behavior, rather than forcing a value that might be wrong.
83
+
84
+ Let's walk through the attributes that deserve a closer look.
85
+
86
+ **`parent_controller`** is a string, not a class. That's deliberate. At configuration time (during boot), the host app's controller class might not be loaded yet. We store the string and `constantize` it later, when Rails actually needs to resolve the inheritance chain.
87
+
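To make the deferred lookup concrete, here is a minimal plain-Ruby sketch. It assumes nothing about the gem's internals: `resolve_parent_controller` is a hypothetical helper, `Admin::BaseController` is a stand-in class, and `Object.const_get` plays the role that ActiveSupport's `constantize` would play inside a Rails app.

```ruby
# Stand-in for a host-app controller (hypothetical, for illustration only).
module Admin
  class BaseController; end
end

# Hypothetical helper: turn the configured string into a class only when
# it is actually needed. In Rails, String#constantize does the same job.
def resolve_parent_controller(name)
  Object.const_get(name)
end

parent_name = "Admin::BaseController" # stored at boot, no class lookup yet
parent_class = resolve_parent_controller(parent_name)
puts parent_class.inspect

# A name that cannot be resolved fails at lookup time, not at configuration time.
begin
  resolve_parent_controller("Missing::Controller")
rescue NameError => e
  puts "lookup failed: #{e.message}"
end
```

The string costs nothing to store, and a typo only surfaces when the lookup runs, which is the tradeoff for avoiding load-order problems.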
88
+ **`context_builder`** is the most interesting one. It's a lambda that receives the current controller instance and returns whatever the host app needs during import. This is how a multi-tenant app passes `current_hotel` into the import flow without the gem knowing anything about hotels. We'll use this extensively when we build the Orchestrator in part 7.
89
+
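Here is a small sketch of that contract outside Rails. `FakeController` is a hypothetical stand-in for a host-app controller, and the hash keys simply mirror the initializer shown earlier; the gem's real call site may look different.

```ruby
# Hypothetical stand-in for a controller exposing the two helpers
# used in the initializer example.
FakeController = Struct.new(:current_hotel, :current_user)

# The builder receives the controller and returns whatever the host app needs.
HOTEL_CONTEXT_BUILDER = ->(controller) {
  { hotel: controller.current_hotel, user: controller.current_user }
}

controller = FakeController.new("Grand Hotel", "alice")
context = HOTEL_CONTEXT_BUILDER.call(controller)

puts context[:hotel]
puts context[:user]
```

The gem only ever sees an opaque hash, so it never needs to know what a hotel is.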
90
+ **`enabled_sources`** lets the host app restrict which source types appear in the UI. If you only deal with CSV files, you can set `config.enabled_sources = %i[csv]` in the initializer and the JSON/API options won't clutter the interface.
91
+
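One way the UI side could apply that setting is sketched below. The `SOURCE_LABELS` registry and the `visible_sources` helper are assumptions made for illustration; the gem may structure its source list differently.

```ruby
# Hypothetical registry of source types and the labels shown in the UI.
SOURCE_LABELS = {
  csv: "CSV file",
  json: "JSON file",
  api: "API endpoint"
}.freeze

# Keep only the sources the host app has enabled.
def visible_sources(enabled_sources)
  SOURCE_LABELS.slice(*enabled_sources)
end

puts visible_sources(%i[csv]).values.inspect
puts visible_sources(%i[csv json api]).keys.inspect
```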
92
+ ### Step 2 -- The module-level DSL
93
+
94
+ The Configuration class is just a data object. We need two module-level methods to turn it into a DSL: one to access the singleton instance, and one to yield it for configuration.
95
+
96
+ ```ruby
97
+ # lib/data_porter.rb
98
+ module DataPorter
99
+   class Error < StandardError; end
100
+
101
+   def self.configuration
102
+     @configuration ||= Configuration.new
103
+   end
104
+
105
+   def self.configure
106
+     yield(configuration)
107
+   end
108
+ end
109
+ ```
110
+
111
+ `configuration` uses memoization via `||=` to ensure a single instance across the application. The first call creates a `Configuration.new` with all defaults; subsequent calls return the same object. This is the singleton pattern without the ceremony of the `Singleton` module.
112
+
113
+ `configure` yields that singleton to a block. Since `attr_accessor` creates both reader and writer methods, the block can set any attribute directly. After the block runs, `DataPorter.configuration.queue_name` returns whatever the host app set -- or the default if they didn't touch it.
114
+
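The override-or-default behavior is easy to check in isolation. The sketch below is a trimmed stand-in with only two attributes, under the hypothetical module name `PorterSketch`, so it runs without Rails or the gem.

```ruby
# Trimmed stand-in for the gem's module (hypothetical name, two attributes).
module PorterSketch
  class Configuration
    attr_accessor :queue_name, :preview_limit

    def initialize
      @queue_name = :imports
      @preview_limit = 500
    end
  end

  # Memoized instance shared by every caller.
  def self.configuration
    @configuration ||= Configuration.new
  end

  # Yield the shared instance so the block can mutate it.
  def self.configure
    yield(configuration)
  end
end

PorterSketch.configure { |config| config.queue_name = :low_priority }

puts PorterSketch.configuration.queue_name    # overridden in the block
puts PorterSketch.configuration.preview_limit # untouched, so still the default
```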
115
+ There's no `reset!` method in production code. We don't need one. The configuration is set once during Rails boot and stays put for the process lifetime. We do need to reset between tests, but that's handled with `instance_variable_set` in the spec (we'll see that in a moment).
116
+
117
+ ### Step 3 -- Wiring it up on boot
118
+
119
+ The configuration file is required early, before the engine loads. This ensures `DataPorter::Configuration` is defined by the time the host app's initializer calls `DataPorter.configure`.
120
+
121
+ ```ruby
122
+ # lib/data_porter.rb (top of file)
123
+ require "rails/engine"
124
+ require_relative "data_porter/version"
125
+ require_relative "data_porter/configuration"
126
+ require_relative "data_porter/engine"
127
+ ```
128
+
129
+ The load order matters. `configuration.rb` comes before `engine.rb` because the Engine class might reference configuration values during setup. In practice, Rails runs the host app's initializers only after the gem itself has been required, so the host app's `configure` block runs with the full gem already available.
130
+
131
+ Other parts of the gem read configuration like this:
132
+
133
+ ```ruby
134
+ # Inside any DataPorter class
135
+ DataPorter.configuration.queue_name
136
+ DataPorter.configuration.context_builder&.call(controller)
137
+ ```
138
+
139
+ The safe navigation operator (`&.`) on `context_builder` handles the nil default gracefully. When no builder is configured, the call simply returns nil instead of raising a NoMethodError.
140
+
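A short sketch of that nil handling follows. `build_context` is a hypothetical helper, and falling back to an empty hash is an assumption added here so callers always get a hash; the gem itself may simply pass nil along.

```ruby
# Hypothetical helper: build the import context, tolerating a missing builder.
# `&.` skips the call when the builder is nil; `|| {}` is an assumed fallback.
def build_context(context_builder, controller)
  context_builder&.call(controller) || {}
end

controller = Object.new # stand-in; a real controller would be passed here

p build_context(nil, controller)                                  # no builder configured
p build_context(->(c) { { source: c.class.name } }, controller)   # builder configured
```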
141
+ ## Decisions & tradeoffs
142
+
143
+ | Decision | We chose | Over | Because |
144
+ |----------|----------|------|---------|
145
+ | Configuration object | Plain class with `attr_accessor` | `OpenStruct`, `Dry::Configurable` | No dependencies, easy to read, easy to document; IDE autocompletion works with real attributes |
146
+ | Singleton pattern | Memoized module instance variable | `Singleton` module, `Rails.application.config` | Simpler API (`DataPorter.configure`), no coupling to Rails config namespace, works in non-Rails test contexts |
147
+ | `context_builder` type | Lambda | Block stored as proc, method object | Lambdas enforce arity (a wrong argument count raises `ArgumentError`), and the syntax `->() {}` signals "this is a callable" to the reader |
148
+ | `parent_controller` type | String | Class constant | Avoids load-order issues; the class may not exist at configuration time, but the string can be `constantize`d later |
149
+ | Default for optional features | `nil` | Null object pattern | Simpler to check `if context_builder` than to create a no-op null object; the gem has few enough optional features to keep the nil checks manageable |
150
+
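The arity argument from the table can be seen directly in plain Ruby. The two builders below are illustrative only and differ in nothing but how they were created.

```ruby
STRICT_BUILDER = ->(controller) { { owner: controller } }    # lambda
LOOSE_BUILDER  = proc { |controller| { owner: controller } } # plain proc

# A lambda rejects a call with the wrong number of arguments.
begin
  STRICT_BUILDER.call # called with no arguments by mistake
rescue ArgumentError => e
  puts "lambda rejected the call: #{e.message}"
end

# A plain proc accepts the same call and fills the argument with nil.
p LOOSE_BUILDER.call
```

Note that `attr_accessor` accepts any object, so nothing stops a host app from assigning a plain proc; the lambda is a convention the initializer example encourages rather than a check the setter performs.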
151
+ ## Testing it
152
+
153
+ The specs verify two things: that defaults are sane, and that the `configure` block actually mutates the singleton.
154
+
155
+ ```ruby
156
+ # spec/data_porter/configuration_spec.rb
157
+ RSpec.describe DataPorter::Configuration do
158
+   subject(:config) { described_class.new }
159
+
160
+   it "has default parent_controller" do
161
+     expect(config.parent_controller).to eq("ApplicationController")
162
+   end
163
+
164
+   it "has default queue_name" do
165
+     expect(config.queue_name).to eq(:imports)
166
+   end
167
+
168
+   it "has default preview_limit" do
169
+     expect(config.preview_limit).to eq(500)
170
+   end
171
+
172
+   it "has nil context_builder by default" do
173
+     expect(config.context_builder).to be_nil
174
+   end
175
+ end
176
+ ```
177
+
178
+ Notice that each default checked here gets its own example. This is intentional. When someone changes a default six months from now, the failure message says exactly which default broke, not just "configuration test failed."
179
+
180
+ The module-level specs test the singleton behavior and the `configure` yield pattern:
181
+
182
+ ```ruby
183
+ # spec/data_porter/configuration_spec.rb
184
+ RSpec.describe DataPorter do
185
+   describe ".configure" do
186
+     after { DataPorter.instance_variable_set(:@configuration, nil) }
187
+
188
+     it "yields the configuration" do
189
+       DataPorter.configure do |config|
190
+         config.queue_name = :custom_queue
191
+       end
192
+
193
+       expect(DataPorter.configuration.queue_name).to eq(:custom_queue)
194
+     end
195
+   end
196
+
197
+   describe ".configuration" do
198
+     after { DataPorter.instance_variable_set(:@configuration, nil) }
199
+
200
+     it "memoizes the configuration" do
201
+       expect(DataPorter.configuration).to be(DataPorter.configuration)
202
+     end
203
+   end
204
+ end
205
+ ```
206
+
207
+ The `after` block resets the singleton between tests using `instance_variable_set`. This is the one place where we reach into internals, and it's acceptable because test isolation trumps encapsulation here. A public `reset!` method would leak test concerns into production code.
208
+
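If the reset ends up repeated across many spec files, one option is a small test-only helper. The sketch below uses hypothetical names (`ResetSketch`, `ConfigurationReset`) and a one-attribute stand-in for the gem, and shows that clearing the instance variable makes the next call build a fresh object with defaults.

```ruby
# One-attribute stand-in for the gem's module (hypothetical name).
module ResetSketch
  class Configuration
    attr_accessor :queue_name

    def initialize
      @queue_name = :imports
    end
  end

  def self.configuration
    @configuration ||= Configuration.new
  end
end

# Test-only helper (hypothetical): would live in spec support, not in the gem.
module ConfigurationReset
  def self.call(mod)
    mod.instance_variable_set(:@configuration, nil)
  end
end

ResetSketch.configuration.queue_name = :custom_queue
stale = ResetSketch.configuration

ConfigurationReset.call(ResetSketch)
fresh = ResetSketch.configuration

puts fresh.queue_name     # the default again
puts stale.equal?(fresh)  # false, because a new object was built
```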
209
+ ## Recap
210
+
211
+ - The `Configuration` class is a plain Ruby object with `attr_accessor` and defaults in `initialize`. No framework magic, no dependencies.
212
+ - Two module-level methods (`configure` and `configuration`) create the DSL that host apps use in their initializer.
213
+ - Sensible defaults mean the gem works with zero configuration. `context_builder` and `scope` are opt-in via nil defaults.
214
+ - Storing `parent_controller` as a string avoids boot-order issues. Using a lambda for `context_builder` enforces arity and reads clearly.
215
+
216
+ ## Next up
217
+
218
+ Configuration tells the gem *how* to behave. In part 4, we'll tackle *what* it operates on: the data models. We'll use StoreModel and JSONB columns to store import records, validation errors, and summary reports as structured data inside a single table -- no migration per import type, no schema sprawl. If you've ever debated "extra table vs. JSON column," that's the one to read.
219
+
220
+ ---
221
+
222
+ *This is part 3 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Scaffolding a Rails Engine gem](#) | [Next: Modeling import data with StoreModel & JSONB](#)*