hekenga 2.0.0 → 2.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: f21c3c1cb0e45c3b9eb2627fc960d032ff981e01b239dc69c02d2aebc1f7b539
-   data.tar.gz: 62fbbd8a65bae8bacc537bcc40cc4a3960b795ebb19a13c34b8ebd539eb102d0
+   metadata.gz: d5a372f23cc2fe1751eb08e07e40705b0361fc609b072e14226b72819cfe9647
+   data.tar.gz: 99228c0f076660abe23c92f37c55509b156810fc0a741764b7872d4a0c597263
  SHA512:
-   metadata.gz: 6317b298a05085564cfaaee1beef6b5749e83f8bc877538b448db66b50c3dde50f0380a5fa7f178dff7feca6840564c486f263891a838d4ed51617b7ff4f8698
-   data.tar.gz: 12ba1c564acca2f7d43c80a384a0767b00592abf5126d0016a0bf7cadf610e6d342890cd54597a602654e8449491135a3dda488bb7c1c8aaeb72d035bdc7c5d6
+   metadata.gz: e2dfea01ce0c1c17fc51ab550e1805653a255d7c65a8b8424c7a1d6e646d59b9bc70fc45cfcf558d1fd7a6df81e5a2a8230a240e8c75d578736cdf19f4e2e290
+   data.tar.gz: 0af8e27a9f1b4c7ae490788f64457a67fae9a3fb9459821b8066bca61671ae30463971e9ec277d01ba358817992260281e5dceaaab676f9093dc05a941a520ce
data/CHANGELOG.md CHANGED
@@ -1,8 +1,15 @@
  # Changelog

+ ## v2.1.0
+
+ - `per_document` task `scope` will no longer let you specify `.only` or
+   `.without` as it could potentially cause data loss.
+ - `per_document` task `scope` now correctly works with `.includes` even
+   in parallel execution mode.
+
  ## v2.0.0

- - `Hekenga::Iterator` has been replaced by `Hekenga::IdIterator`. If any
+ - (breaking) `Hekenga::Iterator` has been replaced by `Hekenga::IdIterator`. If any
    selector or sort is set on a document task migration scope, it no longer forces an
    ascending ID sort. This should help to prevent index misses, though there is a
    tradeoff that documents being concurrently updated may be skipped or
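Reviewer note on the v2.1.0 entry: the data-loss risk can be illustrated with plain hashes. The sketch below is not Hekenga or Mongoid code; `project` and `replace_one` are hypothetical helpers that imitate a field projection and a whole-document replacement, assuming (as the validation message later in this diff suggests) that writes replace the stored document with whatever was loaded.

```ruby
# Hypothetical stand-in for a stored MongoDB document (not Hekenga code).
stored = { "_id" => 1, "email" => "A@EXAMPLE.COM", "name" => "Ada", "role" => "admin" }

# Imitates loading with a projection such as `.only(:email)`:
# only the _id and the requested fields come back.
def project(doc, fields)
  doc.select { |key, _| key == "_id" || fields.include?(key) }
end

# Imitates a whole-document replacement: the stored document becomes
# exactly what was passed in, so fields that were never loaded vanish.
def replace_one(_stored, replacement)
  replacement.dup
end

partial = project(stored, ["email"])
partial["email"] = partial["email"].downcase   # the migration's `up` step
after_write = replace_one(stored, partial)

lost_fields = stored.keys - after_write.keys
puts "fields lost by the write: #{lost_fields.inspect}"
```

Under these assumptions the `name` and `role` fields disappear after the write, which is the scenario the new validation is meant to prevent.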
data/CLAUDE.md ADDED
@@ -0,0 +1,60 @@
+ # CLAUDE.md
+
+ This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+ ## Project Overview
+
+ Hekenga is a Ruby gem providing a migration framework for MongoDB (via Mongoid). It supports sequential and parallel document processing via ActiveJob, with error recovery, validation tracking, and a Thor-based CLI.
+
+ ## Common Commands
+
+ ```bash
+ # Run full test suite (requires MongoDB - see docker-compose.yml)
+ rake spec
+
+ # Run a single spec file
+ rake spec SPEC=spec/hekenga/document_task_spec.rb
+
+ # Install gem locally
+ bundle exec rake install
+
+ # Interactive console with gem loaded
+ bin/console
+ ```
+
+ ## Architecture
+
+ ### Migration Flow
+
+ ```
+ Migration.perform! → MasterProcess.run! → launches tasks in threads
+ SimpleTask: executes up/down blocks directly
+ DocumentTask: iterates documents → batch → execute → write (sequential)
+ ParallelTask: splits into ID batches → enqueues ParallelJob per batch (via ActiveJob)
+ ```
+
+ ### Key Components
+
+ - **`Hekenga::Migration`** — main migration class, orchestrates tasks
+ - **`Hekenga::MasterProcess`** — launches tasks, manages execution/recovery/progress
+ - **`Hekenga::DSL::*`** — fluent DSL for defining migrations (`DSL::Migration`, `DSL::SimpleTask`, `DSL::DocumentTask`)
+ - **`Hekenga::DocumentTaskExecutor`** — core document processing: filter → up block → validate → write
+ - **`Hekenga::ParallelTask`** / **`Hekenga::ParallelJob`** — parallel execution via ActiveJob
+ - **`Hekenga::DocumentTaskRecord`** — Mongoid doc tracking parallel task progress
+ - **`Hekenga::Log`** — Mongoid doc tracking migration/task status (`:naught`, `:running`, `:complete`, `:failed`, `:skipped`)
+ - **`Hekenga::Failure::*`** — error/validation/write/cancelled failure tracking subclasses
+ - **`Hekenga::IdIterator`** / **`Hekenga::MongoidIterator`** — efficient document iteration for parallel vs sequential paths
+
+ ### Task Types
+
+ - **SimpleTask** — one-off up/down blocks, no document iteration
+ - **DocumentTask** — per-document processing with scope, filter, setup, up, down, after blocks; supports `parallel!`, `timeless!`, `always_write!`, `use_transaction!`, configurable write strategies (`:update` vs `:delete_then_insert`)
+
+ ### Configuration
+
+ Via `Hekenga.configure` block — sets migration directory and report frequency. Thread-safe registry tracks all migrations.
+
+ ## Dependencies
+
+ - **mongoid** (>= 6), **activejob** (>= 5), **thor** (1.2.1)
+ - Test: **rspec** (~> 3.0), **database_cleaner-mongoid** (~> 2.0), **pry**
data/README.md CHANGED
@@ -1,7 +1,7 @@
  # Hekenga

- An attempt at a migration framework for MongoDB that supports parallel document
- processing via ActiveJob, chained jobs and error recovery.
+ A migration framework for MongoDB (via Mongoid) that supports parallel document
+ processing via ActiveJob, chained jobs, and error recovery.

  ## Installation

@@ -19,13 +19,135 @@ Or install it yourself as:

      $ gem install hekenga

+ ## Configuration
+
+ ```ruby
+ Hekenga.configure do |config|
+   config.dir = ["db", "hekenga"] # where migration files live (relative to root)
+   config.root = Dir.pwd # application root
+ end
+ ```
+
+ Migrations are stored as Ruby files in the configured directory (default: `db/hekenga/`).
+
  ## Usage

- CLI instructions:
+ ### CLI
+
+ ```
+ $ hekenga help # Show all available commands
+ $ hekenga generate <description> # Generate a new migration scaffold
+ $ hekenga status # Show status of all migrations
+ $ hekenga run_all! # Run all pending migrations in date order
+ $ hekenga run! <path_or_pkey> # Run a specific migration
+ $ hekenga run! <path_or_pkey> --test # Dry run (no writes persisted)
+ $ hekenga run! <path_or_pkey> --clear # Clear logs before running
+ $ hekenga recover! <path_or_pkey> # Re-process failed/invalid records
+ $ hekenga cancel # Cancel all active migrations
+ $ hekenga skip <path_or_pkey> # Mark a migration as skipped
+ $ hekenga clear! <path_or_pkey> # Remove all logs/failures for a migration
+ $ hekenga cleanup # Remove all failure logs
+ ```
+
+ ### Writing Migrations
+
+ Generate a migration scaffold:
+
+     $ hekenga generate "Add default role to users"
+
+ #### Simple Tasks
+
+ Simple tasks run arbitrary code once. Use `actual?` and `test?` to check execution mode.
+
+ ```ruby
+ Hekenga.migration do
+   description "Backfill analytics collection"
+   created "2024-01-15 10:00"
+
+   task "Create indexes" do
+     up do
+       Analytics.create_indexes if actual?
+     end
+   end
+ end
+ ```
+
+ #### Document Tasks
+
+ Document tasks iterate over a Mongoid scope and process each document in batches.
+
+ ```ruby
+ Hekenga.migration do
+   description "Normalize user emails"
+   created "2024-01-15 10:00"
+   batch_size 100 # default batch size for all tasks in this migration
+
+   per_document "Downcase emails" do
+     scope User.all
+
+     # Called once per batch; instance variables are shared with filter/up/after
+     setup do |docs|
+       @domain_map = ExternalService.load_domains
+     end
+
+     # Return false to skip a document
+     filter do |doc|
+       doc.email.present?
+     end
+
+     # Mutate the document in place — Hekenga handles persistence
+     up do |doc|
+       doc.email = doc.email.downcase
+     end
+
+     # Called once per batch with the successfully written documents
+     after do |docs|
+       AuditLog.record(docs.map(&:id))
+     end
+   end
+ end
+ ```
+
+ #### Document Task Options
+
+ ```ruby
+ per_document "Process records" do
+   scope MyModel.where(active: true)
+
+   parallel! # Process batches in parallel via ActiveJob
+   timeless! # Don't update Mongoid timestamps
+   always_write! # Write even if the document didn't change
+   skip_prepare! # Skip Mongoid callbacks on load
+   use_transaction! # Wrap each batch in a MongoDB transaction
+   batch_size 50 # Override migration-level batch size
+   write_strategy :update # :update (default) or :delete_then_insert
+   cursor_timeout 86_400 # Max cursor lifetime in seconds (default: 1 day)
+
+   up do |doc|
+     doc.status = "migrated"
+   end
+ end
+ ```
+
+ ### Test Mode
+
+ Run a migration without persisting changes:
+
+ ```ruby
+ migration = Hekenga.find_migration("2024-01-15-add-default-role-to-users")
+ migration.test_mode!
+ migration.perform!
+ ```
+
+ Or via the CLI:
+
+     $ hekenga run! <path_or_pkey> --test
+
+ ### Recovery

-     $ hekenga help
+ When a migration fails (due to errors, invalid records, or write failures), Hekenga logs the failures and marks the migration as failed. You can re-process only the failed records:

- Migration DSL documentation TBD, for now please look at spec/
+     $ hekenga recover! <path_or_pkey>

  ## Development

@@ -24,6 +24,9 @@ module Hekenga

      def validate!
        raise Hekenga::Invalid.new(self, :ups, "missing") unless ups.any?
+       if scope&.options&.key?(:fields)
+         raise Hekenga::Invalid.new(self, :scope, "uses .only() or .without() which would cause data loss with replace_one")
+       end
      end

      def up!(context, document)
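Reviewer note: the guard added to `validate!` keys off a `:fields` entry in the scope's options. The following is a minimal, self-contained sketch of that check; `FakeCriteria`, `InvalidScope`, and `validate_scope!` are hypothetical stand-ins invented here so the snippet runs without Mongoid or Hekenga loaded, and they are not part of either library.

```ruby
# Stand-ins for illustration only; the real classes live in Hekenga and Mongoid.
FakeCriteria = Struct.new(:options)
class InvalidScope < StandardError; end

# Mirrors the shape of the guard in the hunk above: a projection on the
# scope is assumed to show up as a :fields key in its options hash.
def validate_scope!(scope)
  if scope&.options&.key?(:fields)
    raise InvalidScope, "scope uses .only() or .without()"
  end
  true
end

plain     = FakeCriteria.new({})
projected = FakeCriteria.new({ fields: { "email" => 1 } })

plain_ok = validate_scope!(plain)   # no projection: passes
nil_ok   = validate_scope!(nil)     # safe navigation tolerates a missing scope

rejected = nil
begin
  validate_scope!(projected)        # projection present: raises
rescue InvalidScope => e
  rejected = e.message
end
puts "rejected: #{rejected}"
```

The safe-navigation operators mean a task without a scope, or a scope without options, passes this particular check untouched.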
@@ -59,7 +59,11 @@ module Hekenga
      end

      def record_scope
-       task.scope.klass.unscoped.in(_id: task_record.ids)
+       scope = task.scope.klass.unscoped.in(_id: task_record.ids)
+       if task.scope.inclusions.any?
+         scope = scope.includes(*task.scope.inclusions.map(&:name))
+       end
+       scope
      end

      def records
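Reviewer note: the `record_scope` change rebuilds the per-batch query from the model class and then re-applies the eager-load names taken from the original scope. A rough sketch of that pattern follows, using hypothetical stand-ins (`Inclusion`, `FakeScope`, `in_ids`) rather than real Mongoid criteria, so it only demonstrates the control flow and not actual eager loading.

```ruby
# Hypothetical stand-ins: the real objects are Mongoid criteria and
# association metadata, which are not loaded here.
Inclusion = Struct.new(:name)

class FakeScope
  attr_reader :ids, :included

  def initialize(ids: [], included: [])
    @ids = ids
    @included = included
  end

  # Stand-in for an `_id` membership query; returns a new scope.
  def in_ids(ids)
    FakeScope.new(ids: ids, included: included)
  end

  # Stand-in for `.includes`; records which associations to eager load.
  def includes(*names)
    FakeScope.new(ids: ids, included: included + names)
  end
end

# Same shape as the new method body: start from a fresh ID-only query,
# then carry over the inclusions declared on the original scope, if any.
def record_scope(original_inclusions, batch_ids)
  scope = FakeScope.new.in_ids(batch_ids)
  if original_inclusions.any?
    scope = scope.includes(*original_inclusions.map(&:name))
  end
  scope
end

with_includes    = record_scope([Inclusion.new(:posts), Inclusion.new(:profile)], [1, 2])
without_includes = record_scope([], [3])
puts with_includes.included.inspect
```

Before this change the fresh query dropped the inclusions entirely, which matches the changelog's note about `.includes` in parallel execution mode.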
@@ -1,3 +1,3 @@
  module Hekenga
-   VERSION = "2.0.0"
+   VERSION = "2.1.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: hekenga
  version: !ruby/object:Gem::Version
-   version: 2.0.0
+   version: 2.1.0
  platform: ruby
  authors:
  - Tapio Saarinen
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-07-31 00:00:00.000000000 Z
+ date: 2026-04-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -148,6 +148,7 @@ files:
  - ".rspec"
  - ".travis.yml"
  - CHANGELOG.md
+ - CLAUDE.md
  - Gemfile
  - README.md
  - Rakefile