data_porter 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (159)
  1. checksums.yaml +7 -0
  2. data/.claude/commands/blog-status.md +10 -0
  3. data/.claude/commands/blog.md +109 -0
  4. data/.claude/commands/task-done.md +27 -0
  5. data/.claude/commands/tm/add-dependency.md +58 -0
  6. data/.claude/commands/tm/add-subtask.md +79 -0
  7. data/.claude/commands/tm/add-task.md +81 -0
  8. data/.claude/commands/tm/analyze-complexity.md +124 -0
  9. data/.claude/commands/tm/analyze-project.md +100 -0
  10. data/.claude/commands/tm/auto-implement-tasks.md +100 -0
  11. data/.claude/commands/tm/command-pipeline.md +80 -0
  12. data/.claude/commands/tm/complexity-report.md +120 -0
  13. data/.claude/commands/tm/convert-task-to-subtask.md +74 -0
  14. data/.claude/commands/tm/expand-all-tasks.md +52 -0
  15. data/.claude/commands/tm/expand-task.md +52 -0
  16. data/.claude/commands/tm/fix-dependencies.md +82 -0
  17. data/.claude/commands/tm/help.md +101 -0
  18. data/.claude/commands/tm/init-project-quick.md +49 -0
  19. data/.claude/commands/tm/init-project.md +53 -0
  20. data/.claude/commands/tm/install-taskmaster.md +118 -0
  21. data/.claude/commands/tm/learn.md +106 -0
  22. data/.claude/commands/tm/list-tasks-by-status.md +42 -0
  23. data/.claude/commands/tm/list-tasks-with-subtasks.md +30 -0
  24. data/.claude/commands/tm/list-tasks.md +46 -0
  25. data/.claude/commands/tm/next-task.md +69 -0
  26. data/.claude/commands/tm/parse-prd-with-research.md +51 -0
  27. data/.claude/commands/tm/parse-prd.md +52 -0
  28. data/.claude/commands/tm/project-status.md +67 -0
  29. data/.claude/commands/tm/quick-install-taskmaster.md +23 -0
  30. data/.claude/commands/tm/remove-all-subtasks.md +94 -0
  31. data/.claude/commands/tm/remove-dependency.md +65 -0
  32. data/.claude/commands/tm/remove-subtask.md +87 -0
  33. data/.claude/commands/tm/remove-subtasks.md +89 -0
  34. data/.claude/commands/tm/remove-task.md +110 -0
  35. data/.claude/commands/tm/setup-models.md +52 -0
  36. data/.claude/commands/tm/show-task.md +85 -0
  37. data/.claude/commands/tm/smart-workflow.md +58 -0
  38. data/.claude/commands/tm/sync-readme.md +120 -0
  39. data/.claude/commands/tm/tm-main.md +147 -0
  40. data/.claude/commands/tm/to-cancelled.md +58 -0
  41. data/.claude/commands/tm/to-deferred.md +50 -0
  42. data/.claude/commands/tm/to-done.md +47 -0
  43. data/.claude/commands/tm/to-in-progress.md +39 -0
  44. data/.claude/commands/tm/to-pending.md +35 -0
  45. data/.claude/commands/tm/to-review.md +43 -0
  46. data/.claude/commands/tm/update-single-task.md +122 -0
  47. data/.claude/commands/tm/update-task.md +75 -0
  48. data/.claude/commands/tm/update-tasks-from-id.md +111 -0
  49. data/.claude/commands/tm/validate-dependencies.md +72 -0
  50. data/.claude/commands/tm/view-models.md +52 -0
  51. data/.env.example +12 -0
  52. data/.mcp.json +24 -0
  53. data/.taskmaster/CLAUDE.md +435 -0
  54. data/.taskmaster/config.json +44 -0
  55. data/.taskmaster/docs/prd.txt +2044 -0
  56. data/.taskmaster/state.json +6 -0
  57. data/.taskmaster/tasks/task_001.md +19 -0
  58. data/.taskmaster/tasks/task_002.md +19 -0
  59. data/.taskmaster/tasks/task_003.md +19 -0
  60. data/.taskmaster/tasks/task_004.md +19 -0
  61. data/.taskmaster/tasks/task_005.md +19 -0
  62. data/.taskmaster/tasks/task_006.md +19 -0
  63. data/.taskmaster/tasks/task_007.md +19 -0
  64. data/.taskmaster/tasks/task_008.md +19 -0
  65. data/.taskmaster/tasks/task_009.md +19 -0
  66. data/.taskmaster/tasks/task_010.md +19 -0
  67. data/.taskmaster/tasks/task_011.md +19 -0
  68. data/.taskmaster/tasks/task_012.md +19 -0
  69. data/.taskmaster/tasks/task_013.md +19 -0
  70. data/.taskmaster/tasks/task_014.md +19 -0
  71. data/.taskmaster/tasks/task_015.md +19 -0
  72. data/.taskmaster/tasks/task_016.md +19 -0
  73. data/.taskmaster/tasks/task_017.md +19 -0
  74. data/.taskmaster/tasks/task_018.md +19 -0
  75. data/.taskmaster/tasks/task_019.md +19 -0
  76. data/.taskmaster/tasks/task_020.md +19 -0
  77. data/.taskmaster/tasks/tasks.json +299 -0
  78. data/.taskmaster/templates/example_prd.txt +47 -0
  79. data/.taskmaster/templates/example_prd_rpg.txt +511 -0
  80. data/CHANGELOG.md +29 -0
  81. data/CLAUDE.md +65 -0
  82. data/CODE_OF_CONDUCT.md +10 -0
  83. data/CONTRIBUTING.md +49 -0
  84. data/LICENSE +21 -0
  85. data/README.md +463 -0
  86. data/Rakefile +12 -0
  87. data/app/assets/stylesheets/data_porter/application.css +646 -0
  88. data/app/channels/data_porter/import_channel.rb +10 -0
  89. data/app/controllers/data_porter/imports_controller.rb +68 -0
  90. data/app/javascript/data_porter/progress_controller.js +33 -0
  91. data/app/jobs/data_porter/dry_run_job.rb +12 -0
  92. data/app/jobs/data_porter/import_job.rb +12 -0
  93. data/app/jobs/data_porter/parse_job.rb +12 -0
  94. data/app/models/data_porter/data_import.rb +49 -0
  95. data/app/views/data_porter/imports/index.html.erb +142 -0
  96. data/app/views/data_porter/imports/new.html.erb +88 -0
  97. data/app/views/data_porter/imports/show.html.erb +49 -0
  98. data/config/database.yml +3 -0
  99. data/config/routes.rb +12 -0
  100. data/docs/SPEC.md +2012 -0
  101. data/docs/UI.md +32 -0
  102. data/docs/blog/001-why-build-a-data-import-engine.md +166 -0
  103. data/docs/blog/002-scaffolding-a-rails-engine.md +188 -0
  104. data/docs/blog/003-configuration-dsl.md +222 -0
  105. data/docs/blog/004-store-model-jsonb.md +237 -0
  106. data/docs/blog/005-target-dsl.md +284 -0
  107. data/docs/blog/006-parsing-csv-sources.md +300 -0
  108. data/docs/blog/007-orchestrator.md +247 -0
  109. data/docs/blog/008-actioncable-stimulus.md +376 -0
  110. data/docs/blog/009-phlex-ui-components.md +446 -0
  111. data/docs/blog/010-controllers-routing.md +374 -0
  112. data/docs/blog/011-generators.md +364 -0
  113. data/docs/blog/012-json-api-sources.md +323 -0
  114. data/docs/blog/013-testing-rails-engine.md +618 -0
  115. data/docs/blog/014-dry-run.md +307 -0
  116. data/docs/blog/015-publishing-retro.md +264 -0
  117. data/docs/blog/016-erb-view-templates.md +431 -0
  118. data/docs/blog/017-showcase-final-retro.md +220 -0
  119. data/docs/blog/BACKLOG.md +8 -0
  120. data/docs/blog/SERIES.md +154 -0
  121. data/docs/screenshots/index-with-previewing.jpg +0 -0
  122. data/docs/screenshots/index.jpg +0 -0
  123. data/docs/screenshots/modal-new-import.jpg +0 -0
  124. data/docs/screenshots/preview.jpg +0 -0
  125. data/lib/data_porter/broadcaster.rb +29 -0
  126. data/lib/data_porter/components/base.rb +10 -0
  127. data/lib/data_porter/components/failure_alert.rb +20 -0
  128. data/lib/data_porter/components/preview_table.rb +54 -0
  129. data/lib/data_porter/components/progress_bar.rb +33 -0
  130. data/lib/data_porter/components/results_summary.rb +19 -0
  131. data/lib/data_porter/components/status_badge.rb +16 -0
  132. data/lib/data_porter/components/summary_cards.rb +30 -0
  133. data/lib/data_porter/components.rb +14 -0
  134. data/lib/data_porter/configuration.rb +25 -0
  135. data/lib/data_porter/dsl/api_config.rb +25 -0
  136. data/lib/data_porter/dsl/column.rb +17 -0
  137. data/lib/data_porter/engine.rb +15 -0
  138. data/lib/data_porter/orchestrator.rb +141 -0
  139. data/lib/data_porter/record_validator.rb +32 -0
  140. data/lib/data_porter/registry.rb +33 -0
  141. data/lib/data_porter/sources/api.rb +49 -0
  142. data/lib/data_porter/sources/base.rb +35 -0
  143. data/lib/data_porter/sources/csv.rb +43 -0
  144. data/lib/data_porter/sources/json.rb +45 -0
  145. data/lib/data_porter/sources.rb +20 -0
  146. data/lib/data_porter/store_models/error.rb +13 -0
  147. data/lib/data_porter/store_models/import_record.rb +52 -0
  148. data/lib/data_porter/store_models/report.rb +21 -0
  149. data/lib/data_porter/target.rb +89 -0
  150. data/lib/data_porter/type_validator.rb +46 -0
  151. data/lib/data_porter/version.rb +5 -0
  152. data/lib/data_porter.rb +32 -0
  153. data/lib/generators/data_porter/install/install_generator.rb +33 -0
  154. data/lib/generators/data_porter/install/templates/create_data_porter_imports.rb.erb +21 -0
  155. data/lib/generators/data_porter/install/templates/initializer.rb +30 -0
  156. data/lib/generators/data_porter/target/target_generator.rb +44 -0
  157. data/lib/generators/data_porter/target/templates/target.rb.tt +20 -0
  158. data/sig/data_porter.rbs +4 -0
  159. metadata +274 -0
@@ -0,0 +1,307 @@
---
title: "Building DataPorter #14 -- Dry Run: Validate Before Importing"
series: "Building DataPorter - A Data Import Engine for Rails"
part: 14
tags: [ruby, rails, rails-engine, gem-development, dry-run, validation, store-model, activejob]
published: false
---

# Dry Run: Validate Before Importing

> The preview catches column errors. The dry run catches database errors. Two safety nets, two levels of confidence.

## Context

This is part 14 of the series where we build **DataPorter**, a mountable Rails engine for data import workflows. In [part 13](#), we detailed the testing strategy: in-memory SQLite, structural controller specs, anonymous target classes, and a spec_helper that bootstraps just enough Rails to cover every layer.

We now have a complete import pipeline: parse the file, preview the records, confirm, persist. But there is a gap between the preview and the real import. The preview validates the *data* -- required fields, types, formats. It does not validate what happens when that data hits the database. A uniqueness constraint, a foreign key violation, a custom model validation that queries other tables -- none of these surface until `persist` is actually called. By then, the import is running for real.

In this article, we build a **dry run** mode that bridges this gap. It runs the full persist logic inside a transaction, captures any database-level errors on each record, then rolls back. The user sees exactly which records would fail *before* committing to the import.

## Why two validation layers

The preview phase runs the RecordValidator against column definitions. It catches structural problems: "this field is required and it is empty", "this field should be an email but it is not". These are fast, stateless checks that do not touch the database.

But many real-world validations are stateful. A `validates_uniqueness_of :email` on the User model requires a database query. A `belongs_to :company` with a foreign key constraint requires the company to exist. A custom validation that checks `if: -> { some_scope.exists? }` requires the full ActiveRecord context. None of these can run during preview because there is no model instance, no transaction, no database connection in the validation path.

The dry run fills this gap. It calls the target's `persist` method -- the same method that the real import uses -- but captures exceptions instead of letting them propagate. Each record gets annotated with its result: passed or failed, with the error message attached.

## The `dry_run_enabled` DSL flag

Not every import needs a dry run. A simple CSV-to-table import with no uniqueness constraints might not benefit from the overhead. We make it opt-in at the target level:

```ruby
# app/importers/user_import.rb
class UserImport < DataPorter::Target
  label "Users"
  model_name "User"
  dry_run_enabled

  columns do
    column :email, type: :email, required: true
    column :name, type: :string, required: true
  end

  def persist(record, context:)
    User.create!(record.attributes)
  end
end
```

The `dry_run_enabled` class method is a simple flag on the Target DSL:

```ruby
# lib/data_porter/target.rb
class << self
  attr_reader :_dry_run_enabled

  def dry_run_enabled
    @_dry_run_enabled = true
  end
end
```

No arguments, no block, no configuration. Either the target supports dry run or it does not. The controller checks `target_class._dry_run_enabled` to decide whether to show the "Dry Run" button on the preview page.
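As a self-contained illustration of this pattern (plain Ruby, outside the gem), the flag is just an instance variable on the class object -- which also means each subclass carries its own flag and does not inherit it unless it calls the macro itself:

```ruby
# Standalone sketch of a class-level DSL flag (illustrative, not gem code).
class Target
  class << self
    attr_reader :_dry_run_enabled

    def dry_run_enabled
      @_dry_run_enabled = true
    end
  end
end

class UserImport < Target
  dry_run_enabled
end

class PlainImport < Target
end

# Instance variables live on each class object individually, so a subclass
# that never calls the macro reads nil (falsy) through the inherited reader.
puts UserImport._dry_run_enabled            # => true
puts PlainImport._dry_run_enabled.inspect   # => nil
```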

## The `dry_running` status

The DataImport enum needed a new state. The import transitions from `previewing` to `dry_running` while the dry run executes, then back to `previewing` when it completes:

```ruby
# app/models/data_porter/data_import.rb
enum :status, {
  pending: 0,
  parsing: 1,
  previewing: 2,
  importing: 3,
  completed: 4,
  failed: 5,
  dry_running: 6
}
```

The value `6` is appended at the end rather than inserted in logical order. This is intentional -- existing records in production have integer status values. Inserting `dry_running` at position 3 would shift `importing`, `completed`, and `failed`, corrupting every existing import. Enums with integer backing must be append-only.

The state flow is: `previewing -> dry_running -> previewing`. The dry run is not a terminal state. It enriches the records with database-level feedback and returns to the preview so the user can review the results and decide whether to proceed with the real import.
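A quick plain-Ruby illustration of why the append-only rule matters (illustrative sketch, not gem code): the integer is what gets stored, and it is positional, so inserting a new state "in logical order" silently re-labels existing rows:

```ruby
# Mapping as shipped: each state's integer is its position.
original = { pending: 0, parsing: 1, previewing: 2,
             importing: 3, completed: 4, failed: 5 }

# Inserting dry_running "in logical order" renumbers everything after it.
inserted = %i[pending parsing previewing dry_running
              importing completed failed].each_with_index.to_h

# A row persisted with status = 3 meant "importing"...
puts original.key(3)  # => importing
# ...but under the reshuffled mapping the same integer decodes differently.
puts inserted.key(3)  # => dry_running
```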

## The `dry_run!` flow

The Orchestrator gains a third public method alongside `parse!` and `import!`:

```ruby
# lib/data_porter/orchestrator.rb
def dry_run!
  @data_import.dry_running!
  run_dry_run_records
  @data_import.update!(status: :previewing)
  build_report
rescue StandardError => e
  handle_failure(e)
end
```

The structure mirrors `parse!` and `import!`: transition to the working status, do the work, transition to the result status, rebuild the report. The `rescue` catches catastrophic failures and transitions to `failed` with an error report.

The real work happens in `run_dry_run_records`:

```ruby
def run_dry_run_records
  records = @data_import.records
  importable = records.select(&:importable?)
  context = build_context

  importable.each do |record|
    dry_run_record(record, context)
  end

  @data_import.records_will_change!
  @data_import.update!(records: records)
end

def dry_run_record(record, context)
  @target.persist(record, context: context)
  record.dry_run_passed = true
rescue StandardError => e
  record.dry_run_passed = false
  record.add_error(e.message)
end
```

For each importable record, we call the target's actual `persist` method. If it succeeds, `dry_run_passed` is set to `true`. If it raises -- an `ActiveRecord::RecordInvalid`, a constraint violation, any exception -- we capture the message on the record and mark it as failed. Nothing is committed: the persist calls run inside a transaction that is rolled back at the end, and because errors are rescued per record, one failure does not prevent the remaining records from being checked.
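The error-boundary shape can be sketched standalone (illustrative names; `probe` stands in for `dry_run_record`, and the raise simulates a database-level failure):

```ruby
# Each record is probed independently, so one failure cannot mask the others.
Record = Struct.new(:attrs, :dry_run_passed, :error)

def probe(record)
  # Stand-in for a uniqueness violation raised by the real persist call.
  raise ArgumentError, "email taken" if record.attrs[:email] == "dup@example.com"
  record.dry_run_passed = true
rescue StandardError => e
  record.dry_run_passed = false
  record.error = e.message
end

records = [
  Record.new({ email: "ok@example.com" }),
  Record.new({ email: "dup@example.com" }),
  Record.new({ email: "also-ok@example.com" })
]
records.each { |r| probe(r) }

puts records.map(&:dry_run_passed).inspect  # => [true, false, true]
```

Note how the middle failure does not stop the third record from being checked -- the equivalent of the "per-record rescue" decision in the tradeoffs table below.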

The `dry_run_passed` attribute on ImportRecord is a simple boolean:

```ruby
# lib/data_porter/store_models/import_record.rb
attribute :dry_run_passed, :boolean, default: false
```

After the dry run, the preview table can show a green check or a red cross next to each record, along with the specific error message for failures. The user gets a precise map of what will work and what will not.

## The StoreModel dirty tracking gotcha

There is a subtle but critical line in `run_dry_run_records`:

```ruby
@data_import.records_will_change!
@data_import.update!(records: records)
```

Why `records_will_change!` before `update!`? The answer lies in how ActiveRecord tracks changes on complex attributes.

StoreModel attributes are serialized to JSON and stored in a text (or JSONB) column. When you modify an object *in place* -- setting `record.dry_run_passed = true` on a record that already exists in the `records` array -- ActiveRecord does not detect the change. From its perspective, the `records` attribute still points to the same Ruby array: the remembered "before" value and the current value are the same mutated object, so comparing them reveals no difference, even though the serialized content has changed.

Without `records_will_change!`, the `update!` call would see "records has not changed" and skip the column in the SQL UPDATE. The dry run results would be computed correctly in memory but never persisted to the database. The user would see no change on the preview page.

`records_will_change!` explicitly marks the attribute as dirty, forcing ActiveRecord to include it in the next save. This is a well-known pattern with serialized attributes, but it is easy to forget -- and the failure mode is silent. The data looks correct in the current process, the tests that do not reload from the database pass, and only the production user sees stale results.

This is one of those bugs that TDD catches early. The spec reloads the import from the database and checks `dry_run_passed` on the reloaded records:

```ruby
it "marks records as dry_run_passed on success" do
  DataPorter::Orchestrator.new(import.reload).dry_run!
  import.reload.records.each do |record|
    expect(record.dry_run_passed).to be true
  end
end
```

The `import.reload` forces a fresh read from SQLite. Without `records_will_change!`, this spec fails -- `dry_run_passed` is still `false` in the database even though it was set to `true` in memory.
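The trap can be demonstrated in plain Ruby (an illustrative sketch, not ActiveRecord itself): a tracker that keeps a reference to the "original" value cannot see in-place mutations, while a deep snapshot can:

```ruby
# Naive change tracker: remembers the original by reference, not by copy.
class Tracker
  def initialize(value)
    @original = value   # same object, not a deep copy
    @current  = value
  end

  def changed?
    @original != @current
  end
end

records  = [{ dry_run_passed: false }]
tracker  = Tracker.new(records)
snapshot = Marshal.load(Marshal.dump(records))  # deep copy taken up front

records.first[:dry_run_passed] = true           # in-place mutation

puts tracker.changed?       # => false -- before/after are the same object
puts(snapshot != records)   # => true  -- the content did change
```

Calling something like `records_will_change!` sidesteps the comparison entirely by declaring the attribute dirty up front.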

## DryRunJob

Like `parse!` and `import!`, the dry run executes asynchronously via a dedicated job:

```ruby
# app/jobs/data_porter/dry_run_job.rb
class DryRunJob < ActiveJob::Base
  queue_as { DataPorter.configuration.queue_name }

  def perform(import_id)
    data_import = DataImport.find(import_id)
    Orchestrator.new(data_import).dry_run!
  end
end
```

Same pattern as ParseJob and ImportJob: find the import, delegate to the Orchestrator. The job itself has no logic -- it is a one-liner that bridges the async boundary.

## Controller action and route

The controller gains a `dry_run` action:

```ruby
# app/controllers/data_porter/imports_controller.rb
before_action :set_import, only: %i[show parse confirm cancel dry_run]

def dry_run
  DataPorter::DryRunJob.perform_later(@import.id)
  redirect_to import_path(@import)
end
```

And the route:

```ruby
resources :imports, only: %i[index new create show] do
  member do
    post :parse
    post :confirm
    post :cancel
    post :dry_run
  end
end
```

The pattern is identical to the other member actions: POST triggers a side effect (enqueue a job), redirect back to the show page where ActionCable will push progress updates. The view conditionally shows the "Dry Run" button only when `target_class._dry_run_enabled` is true and the import is in `previewing` status.

## Testing

The dry run specs follow the series' established patterns -- anonymous target classes, registry cleanup, and database round-trip assertions:

```ruby
RSpec.describe "Dry Run" do
  let(:target_class) do
    klass = Class.new(DataPorter::Target) do
      label "Guests"
      model_name "Guest"
      dry_run_enabled

      columns do
        column :first_name, type: :string, required: true
        column :last_name, type: :string
      end
    end
    klass.define_method(:persist) do |record, context:|
      record
    end
    klass
  end

  describe "Orchestrator#dry_run!" do
    it "transitions to previewing after dry run" do
      DataPorter::Orchestrator.new(import.reload).dry_run!
      expect(import.reload.status).to eq("previewing")
    end

    it "marks records as dry_run_passed on success" do
      DataPorter::Orchestrator.new(import.reload).dry_run!
      import.reload.records.each do |record|
        expect(record.dry_run_passed).to be true
      end
    end

    it "captures errors from failing persist" do
      # Target that raises ActiveRecord::RecordInvalid
      DataPorter::Orchestrator.new(failing_import.reload).dry_run!

      record = failing_import.reload.records.first
      expect(record.dry_run_passed).to be false
      expect(record.errors_list.map(&:message)).to include(match(/Validation failed/))
    end
  end
end
```

The failing target class simulates a database-level error by raising `ActiveRecord::RecordInvalid` in `persist`. The spec verifies that the error is captured on the record, that `dry_run_passed` is false, and that the import still transitions to `previewing` -- not `failed`. A record-level error is expected operational feedback, not a catastrophic failure.

The DryRunJob spec verifies delegation:

```ruby
describe "DryRunJob" do
  it "calls Orchestrator#dry_run!" do
    orchestrator = instance_double(DataPorter::Orchestrator, dry_run!: nil)
    allow(DataPorter::Orchestrator).to receive(:new).and_return(orchestrator)

    DataPorter::DryRunJob.new.perform(import.id)

    expect(orchestrator).to have_received(:dry_run!)
  end
end
```

## Decisions & tradeoffs

| Decision | We chose | Over | Because |
|----------|----------|------|---------|
| Opt-in flag | `dry_run_enabled` on Target DSL | Always-on dry run | Not every import benefits from the overhead; simple imports can skip it |
| Status value | Append `dry_running: 6` at the end of the enum | Insert in logical order | Integer-backed enums must be append-only to avoid corrupting existing data |
| Dirty tracking | Explicit `records_will_change!` | Reassigning the array (`self.records = records.dup`) | More explicit about intent; avoids unnecessary array duplication; documents the StoreModel gotcha |
| Error boundary | Per-record rescue in `dry_run_record` | Wrapping all records in a single begin/rescue | One failing record should not prevent the others from being validated |

## Recap

- The **dry run** bridges the gap between preview (column-level validation) and real import (database-level validation), giving users a complete picture before any data is committed.
- The **`dry_run_enabled` DSL flag** makes it opt-in per target -- not every import needs the overhead.
- The **`dry_running` status** follows the append-only rule for integer-backed enums, preserving existing data.
- The **`records_will_change!` call** is the key to making StoreModel in-place mutations persist -- without it, ActiveRecord skips the attribute in the SQL UPDATE because its dirty tracking does not detect in-place changes on serialized objects.
- The **DryRunJob** follows the same thin-job pattern as ParseJob and ImportJob: find, delegate, done.
- The **controller action and route** mirror the existing member actions: POST triggers a job, redirect back to show.

## Next up

We now have a full-featured, tested, and safe import engine. In part 15, we wrap up the series: **publishing the gem to RubyGems**, writing a proper CHANGELOG, choosing a versioning strategy, and reflecting on what worked, what we would do differently, and what DataPorter looks like from the outside.

---

*This is part 14 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Testing a Rails Engine with RSpec](#) | [Next: Publishing the Gem & Retrospective](#)*
@@ -0,0 +1,264 @@
---
title: "Building DataPorter #15 -- Publishing and Retrospective"
series: "Building DataPorter - A Data Import Engine for Rails"
part: 15
tags: [ruby, rails, rails-engine, gem-development, rubygems, retrospective, open-source]
published: false
---

# Publishing and Retrospective

> From `bundle gem` to `gem push`: looking back on 14 articles, 20 components, and the lessons learned building a Rails engine from scratch with TDD.

## Context

This is the final article in the series where we build **DataPorter**, a mountable Rails engine for data import workflows. In [part 14](#), we added Dry Run mode -- the last safety net before data touches the database.

We started this series with a question: why do we keep rebuilding the same import workflow in every Rails app? Fourteen articles later, we have a published gem that answers it. This article covers the last mile -- publishing to RubyGems -- then steps back to look at what we built, what we learned, and what we would do differently.

## Publishing the gem

### The final gemspec

The gemspec is the identity card of a Ruby gem. Everything RubyGems needs to index, display, and resolve dependencies lives here. Here is ours in its final form:

```ruby
# data_porter.gemspec
Gem::Specification.new do |spec|
  spec.name = "data_porter"
  spec.version = DataPorter::VERSION
  spec.authors = ["Seryl Lounis"]
  spec.email = ["seryllounis@outlook.fr"]

  spec.summary = "Rails engine for multi-step data imports with preview"
  spec.description = "A mountable Rails engine providing a complete data import workflow: " \
                     "upload/configure, preview with validation, and import. " \
                     "Supports CSV, JSON, and API sources with a simple DSL for defining import targets."
  spec.homepage = "https://github.com/SerylLns/data_porter"
  spec.license = "MIT"
  spec.required_ruby_version = ">= 3.2.0"

  spec.metadata["homepage_uri"] = spec.homepage
  spec.metadata["source_code_uri"] = "https://github.com/SerylLns/data_porter"
  spec.metadata["changelog_uri"] = "https://github.com/SerylLns/data_porter/blob/master/CHANGELOG.md"
  spec.metadata["rubygems_mfa_required"] = "true"

  # ...

  spec.add_dependency "csv"
  spec.add_dependency "phlex", ">= 1.0"
  spec.add_dependency "rails", ">= 7.0"
  spec.add_dependency "store_model", ">= 2.0"
  spec.add_dependency "turbo-rails", ">= 1.0"
end
```

A few points worth noting. `rubygems_mfa_required` forces multi-factor authentication to publish -- it has become the standard for any serious open source gem. The `required_ruby_version` of `>= 3.2.0` excludes Ruby versions that are no longer maintained. The runtime dependencies are deliberately loose (`>= 1.0`, `>= 7.0`) to avoid pinning host apps to specific versions.

The `spec.files` filter excludes development files (`spec/`, `bin/`, `.github/`) so that the published gem contains only production code. This matters -- nobody wants to download 2 MB of specs when installing a gem.

### Versioning

DataPorter follows semantic versioning:

- **0.1.0**: the first release. The `0.x` makes it clear that the API may still evolve.
- **0.x.y**: each new feature (a new source type, a new component) bumps the minor version. Each bugfix bumps the patch.
- **1.0.0**: will come once the API has stabilized and been tested in production across several apps.

The version number lives in a single file:

```ruby
# lib/data_porter/version.rb
module DataPorter
  VERSION = "0.1.0"
end
```

A single place to edit. The gemspec reads it with `require_relative`. The CHANGELOG references it. The Git tag matches it. No duplication.
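As a side note, RubyGems ships `Gem::Version`, which implements exactly this ordering -- handy for sanity-checking version comparisons:

```ruby
# Gem::Version comes with RubyGems, available in any modern Ruby.
require "rubygems"

# Semantic ordering: a minor bump outranks any patch of the previous minor.
v_current = Gem::Version.new("0.1.0")
v_patch   = Gem::Version.new("0.1.1")
v_minor   = Gem::Version.new("0.2.0")

puts v_patch > v_current  # => true
puts v_minor > v_patch    # => true

# Pre-release versions sort before their final release.
puts Gem::Version.new("1.0.0.pre") < Gem::Version.new("1.0.0")  # => true
```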

### The release workflow

```bash
# 1. Bump the version
# lib/data_porter/version.rb -> VERSION = "0.1.0"

# 2. Update the CHANGELOG
# CHANGELOG.md -> ## [0.1.0] - 2026-02-06

# 3. Commit, tag, push
git add -A && git commit -m "Release v0.1.0"
git tag v0.1.0
git push origin master --tags

# 4. Build and push
gem build data_porter.gemspec
gem push data_porter-0.1.0.gem
```

Or, if the Rakefile is set up with `bundler/gem_tasks`:

```bash
bundle exec rake release
```

This command does everything in one go: build, Git tag, Git push, RubyGems push. It is the recommended approach because it guarantees that the tag and the gem stay in sync.

## Documentation

A gem without documentation is a gem nobody will use. DataPorter relies on three levels of documentation:

**The README**: the entry point. One-command installation (`rails generate data_porter:install`), a 15-line Target example, the three-step workflow diagram. A developer should be able to understand what the gem does and install it in under 5 minutes.

**The CHANGELOG**: every release documented with what changed, what was added, what broke. [Keep a Changelog](https://keepachangelog.com/) format -- a standard the Ruby community knows.

**Inline comments**: every public method documented with YARD. The DSL is the most critical surface -- `column`, `sources`, `csv_mapping`, and `persist` must be documented with examples, because that is what users will read the most.

## What we built

Here is the complete list of components that make up DataPorter, in the order we built them:

| # | Component | Role |
|---|-----------|------|
| 1 | **Engine + isolate_namespace** | Gem structure, namespace isolation |
| 2 | **Configuration DSL** | `DataPorter.configure`, defaults, `context_builder` |
| 3 | **StoreModels (ImportRecord, Error, Report)** | Typed JSONB structures without extra tables |
| 4 | **TypeValidator** | Type validation (email, phone, url) on columns |
| 5 | **Target DSL** | `label`, `model`, `columns`, `sources`, `persist` |
| 6 | **Registry** | Target auto-discovery and resolution |
| 7 | **Source::Base + Source::CSV** | Source abstraction, CSV parsing with mapping |
| 8 | **DataImport model** | ActiveRecord, status enum, polymorphic user |
| 9 | **Orchestrator** | Parse/import coordination, per-record error handling |
| 10 | **RecordValidator** | Generic validations (required, type) |
| 11 | **ParseJob + ImportJob** | Background processing via ActiveJob |
| 12 | **Broadcaster + ImportChannel** | Real-time progress via ActionCable |
| 13 | **7 Phlex components** | StatusBadge, SummaryCards, PreviewTable, ProgressBar, ResultsSummary, FailureAlert |
| 14 | **Stimulus controller** | Client-side progress bar animation |
| 15 | **ImportsController** | Dynamic inheritance, 7 actions, Turbo integration |
| 16 | **Install generator** | Migration, initializer, routes, importers directory |
| 17 | **Target generator** | Target scaffold with column parsing |
| 18 | **Source::JSON** | Import from a JSON file or raw text |
| 19 | **Source::API** | Import from an HTTP endpoint with auth and params |
| 20 | **Dry Run** | Transaction + rollback, records enriched with DB errors |

Twenty components. Each with its specs. Each with an article explaining why it exists and how it works.

## The architecture: the complete flow

Here is what happens when a user imports a CSV file, from start to finish:

+ ```
150
+ Upload (Controller#create)
151
+ |
152
+ v
153
+ Parse (ParseJob -> Orchestrator#parse!)
154
+ |-- Source::CSV.fetch -> raw rows
155
+ |-- Target.transform(record) -> transformation
156
+ |-- RecordValidator.validate(record) -> required, types
157
+ |-- Target.validate(record) -> business rules
158
+ |-- record.determine_status! -> complete/partial/missing
159
+ |-- Broadcaster -> ActionCable -> Stimulus -> progress bar
160
+ |
161
+ v
162
+ Preview (Controller#show)
163
+ |-- PreviewTable(columns, records) -> tableau dynamique
164
+ |-- SummaryCards(report) -> compteurs par statut
165
+ |-- StatusBadge(status) -> badge "previewing"
166
+ |
167
+ v
168
+ Dry Run (DryRunJob -> Orchestrator dans transaction + rollback)
169
+ |-- Enrichit les records avec les erreurs DB
170
+ |-- Broadcaster -> progression
171
+ |
172
+ v
173
+ Import (ImportJob -> Orchestrator#import!)
174
+ |-- Target.persist(record, context:) -> par record
175
+ |-- rescue -> record.add_error, continue
176
+ |-- Target.after_import(results, context:)
177
+ |-- Broadcaster -> "completed"
178
+ |
179
+ v
180
+ Results (Controller#show)
181
+ |-- ResultsSummary(report) -> imported/errored counts
182
+ |-- PreviewTable avec erreurs inline
183
+ ```

The gem owns the infrastructure. The host app owns the business logic. The separation is clean: a single Target file and an initializer are all the host app has to provide.
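
As an illustration, here is roughly what such a host-app target file might look like, using the DSL methods named in the series (`label`, `model`, `columns`, `sources`, `persist`). The `ContactsTarget` class, its column names, and the exact option signatures are hypothetical; whether `persist` is class- or instance-level is glossed over here:

```ruby
# app/importers/contacts_target.rb -- hypothetical example, not from the gem docs.
class ContactsTarget < DataPorter::Target
  label "Contacts"
  model "Contact"                  # the host app's own ActiveRecord model

  columns do
    column :name,  required: true
    column :email, type: :email, required: true
    column :phone, type: :phone
  end

  sources :csv, :json

  # Called once per record during import; raising lets the Orchestrator
  # record the error and continue with the next record.
  def persist(record, context:)
    Contact.create!(record.attributes.slice(:name, :email, :phone))
  end
end
```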

## Lessons learned

### TDD without a dummy app

The most consequential decision of the series: testing the engine without creating a Rails application in `spec/dummy/`. A 60-line `spec_helper.rb` bootstraps in-memory SQLite, configures the load paths, and stubs `ApplicationController`. It works, and it works well -- the suite runs in under a second.

The unexpected benefit: the constraint forces every component to stay decoupled. If a component needs a router to be testable, that is a signal it is too coupled to the framework. The structural tests on controllers (checking inheritance, callbacks, methods) felt odd at first. In hindsight, they test exactly what the gem owns -- the wiring -- and leave integration testing to the host app.

The trap to avoid: duplication between the schema in `spec_helper.rb` and the migration template. If the two diverge, the tests pass but the generated migration no longer matches what was tested. An explicit comment in the spec_helper records this dependency.
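
A condensed sketch of that bootstrap, assuming the `activerecord` and `sqlite3` gems are available. The table and column names shown are illustrative, not the gem's actual schema:

```ruby
# spec_helper.rb (excerpt) -- boot ActiveRecord against in-memory SQLite,
# no dummy Rails app required.
require "active_record"

ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: ":memory:")

ActiveRecord::Schema.define do
  # NOTE: keep in sync with the install generator's migration template!
  create_table :data_porter_data_imports, force: true do |t|
    t.string :status, default: "pending"
    t.text   :records              # jsonb in production, text under SQLite
    t.text   :report
    t.references :user, polymorphic: true
    t.timestamps
  end
end

# Minimal stand-in for the host app's base controller, so the engine's
# controller classes can load in specs.
class ApplicationController; end unless defined?(ApplicationController)
```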

### StoreModel: the gotchas

StoreModel is powerful, but it has its subtleties:

**Dirty tracking**: when you mutate an object inside a `store_model` attribute, ActiveRecord does not detect the change. You can set `data_import.records.first.status = "complete"` and call `save` -- nothing gets persisted. The fix: call `records_will_change!` before mutating, or reassign the whole attribute with `data_import.records = modified_records`.
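
The failure mode can be reproduced without ActiveRecord at all. This toy model (a deliberate simplification, not StoreModel's actual mechanism) only flips its dirty flag on whole-attribute assignment, which is essentially why in-place mutation of a serialized column goes unseen:

```ruby
require "json"

# Toy model of a serialized column: changes are only noticed when the whole
# attribute is reassigned -- in-place mutation never sets the dirty flag.
class FakeModel
  attr_reader :saved_json

  def initialize
    @records = [{ "status" => "pending" }]
    @saved_json = JSON.generate(@records)
  end

  def records
    @records
  end

  def records=(value)
    @changed = true              # only a full assignment marks the model dirty
    @records = value
  end

  def save
    return false unless @changed # in-place edits never got here
    @saved_json = JSON.generate(@records)
    @changed = false
    true
  end
end

m = FakeModel.new
m.records.first["status"] = "complete"  # in-place mutation
m.save                                  # returns false: nothing persisted
m.records = m.records.dup               # reassign the whole attribute...
m.save                                  # ...and now the save goes through
```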

**Serialization round-trip**: symbol keys come back as string keys after a save/reload. `{ name: "Alice" }` returns as `{ "name" => "Alice" }`. You have to know this and code accordingly -- either always use string keys, or call `symbolize_keys` on the way out. DataPorter does the latter, in `ImportRecord#attributes`.
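
The round-trip is easy to demonstrate with nothing but the standard library, since a jsonb column is ultimately JSON text (`transform_keys(&:to_sym)` stands in for ActiveSupport's `symbolize_keys`):

```ruby
require "json"

record = { name: "Alice" }          # symbol keys going in
stored = JSON.generate(record)      # what lands in the jsonb column
reloaded = JSON.parse(stored)       # what comes back after save/reload

reloaded["name"]                    # "Alice" -- string key now
reloaded.key?(:name)                # false: the symbol key is gone

# The symbolize-keys step on the way out:
normalized = reloaded.transform_keys(&:to_sym)
normalized[:name]                   # "Alice" again
```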

**SQLite vs PostgreSQL**: in tests, StoreModel columns are `text`. In production, they are `jsonb`. StoreModel handles the difference transparently, but some JSONB queries (indexes, contains) cannot be tested against SQLite. That is an acceptable trade-off for the speed of the feedback loop.

### Phlex in an engine: `plain` vs `text`

A Phlex-specific trap: to emit raw text inside an element, you must use `plain` (not `text`). Early versions of Phlex had `text`, but it was renamed. Call `text` on a recent version and you get a cryptic `NoMethodError`. SummaryCards shows this well:

```ruby
def card(css_class, count, label)
  div(class: "dp-card #{css_class}") do
    strong { count.to_s }
    plain " #{label}" # not text, not p -- just raw text
  end
end
```

The other subtlety: calling `super()` in each component's `initialize`. Phlex requires it, and forgetting it produces silent failures or blank renders.

### Testing patterns: controllers, channels, JS

Testing JavaScript from Ruby by reading the file as text and asserting on strings -- it sounds hacky. In practice, it catches the most common category of bugs in an engine: drift between the Ruby code and the JS code. The channel is called `DataPorter::ImportChannel` in Ruby and `"DataPorter::ImportChannel"` in JS. If one changes and not the other, the test fails. For a single 30-line Stimulus file, that beats adding Jest and `node_modules` to the project.
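
A sketch of that string-assertion approach. The JS snippet is inlined here to keep the example self-contained; the real spec would `File.read` the Stimulus controller (the path and the subscription shape are assumptions, not the gem's actual file):

```ruby
# In the real spec this would be something like:
#   js_source = File.read("app/javascript/data_porter/import_controller.js")
js_source = <<~JS
  this.subscription = consumer.subscriptions.create(
    { channel: "DataPorter::ImportChannel", import_id: this.importIdValue },
    { received: (data) => this.update(data) }
  )
JS

# The Ruby-side channel class name the JS must agree with.
expected_channel = "DataPorter::ImportChannel"

# If someone renames the Ruby channel without touching the JS (or vice
# versa), this assertion catches the drift.
js_source.include?(%("#{expected_channel}"))  # true: the two sides line up
```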

The structural controller tests (`_process_action_callbacks`, `instance_method`, `superclass`) form a contract: the gem guarantees the controller has the right shape, and the host app guarantees it behaves correctly in its context. That is a clean separation of responsibilities.

## What's next?

DataPorter 0.1.0 covers the standard workflow. Here is what could come in later versions:

**Batch imports**: for files of 100k+ rows, import in batches of 1,000 with `insert_all` instead of a `create!` per record. That requires rethinking the `persist` contract -- instead of one record at a time, the target would receive a batch.
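
The batching itself is just `each_slice`; the `persist_batch` name sketched in the trailing comment is hypothetical, not an existing DataPorter API:

```ruby
# Slice 2,500 rows into batches of 1,000: each slice would become a single
# bulk write (insert_all) instead of 1,000 individual create! calls.
BATCH_SIZE = 1_000

rows = (1..2_500).map { |i| { email: "user#{i}@example.com" } }

batches = rows.each_slice(BATCH_SIZE).to_a
batches.size        # 3 slices: 1,000 + 1,000 + 500
batches.last.size   # 500

# Hypothetical reshaped contract:
# def persist_batch(batch, context:)
#   Contact.insert_all(batch)   # one INSERT per slice
# end
```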

**Progress streaming**: replace ActionCable with Server-Sent Events (SSE) for apps that do not need a bidirectional WebSocket. Lighter, and no Redis dependency.

**Custom validators**: let targets declare validators through the DSL:

```ruby
columns do
  column :email, type: :email, required: true, validate: ->(val) {
    "already exists" if User.exists?(email: val)
  }
end
```

**Export**: the reverse path. If we know how to parse and validate records, we also know how to serialize them to CSV/JSON. The Target already holds all the information needed (columns, types, labels).

**Excel support**: a `Source::Xlsx` built on `roo` or `creek` to parse `.xlsx` files. The Source pattern is in place; only `fetch` needs implementing.

## Final thoughts

Building DataPorter was an exercise in discipline as much as in code. The method -- Taskmaster to plan, TDD to implement, an article to document each step -- forces explicit decisions. No "we'll figure it out later". Every component exists because a test demands it, and every test exists because a behavior was specified.

Skipping the dummy app was a bet. It paid off: the tests are fast, the components are decoupled, and the gem is testable without any Rails infrastructure. But it has a cost -- some integration bugs will only surface in the host app. That is a deliberate trade-off: the gem tests its wiring, the host app tests its behavior.

StoreModel, Phlex, Stimulus -- each dependency brought its share of surprises. StoreModel's dirty tracking, Phlex's `plain` vs `text`, Stimulus's double-dash naming for engines. These gotchas appear in no documentation. They appear when a test fails at 11pm and you end up reading the gem's source code to understand why. That is the real advantage of TDD: you discover the problems in the terminal, not in production.

DataPorter is now a gem published on RubyGems. A `bundle add data_porter`, a `rails generate data_porter:install`, a 15-line Target, and any Rails app has a complete import system with preview, validation, real-time progress, and dry run.

That was the plan from the start. It took 15 articles to get there.

---

*This is part 15 of the series "Building DataPorter - A Data Import Engine for Rails". [Previous: Dry Run: Validate Before You Persist](#)*