hekenga 2.0.0 → 2.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +8 -1
- data/CLAUDE.md +60 -0
- data/README.md +127 -5
- data/lib/hekenga/document_task.rb +3 -0
- data/lib/hekenga/document_task_executor.rb +5 -1
- data/lib/hekenga/version.rb +1 -1
- metadata +3 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d5a372f23cc2fe1751eb08e07e40705b0361fc609b072e14226b72819cfe9647
+  data.tar.gz: 99228c0f076660abe23c92f37c55509b156810fc0a741764b7872d4a0c597263
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: e2dfea01ce0c1c17fc51ab550e1805653a255d7c65a8b8424c7a1d6e646d59b9bc70fc45cfcf558d1fd7a6df81e5a2a8230a240e8c75d578736cdf19f4e2e290
+  data.tar.gz: 0af8e27a9f1b4c7ae490788f64457a67fae9a3fb9459821b8066bca61671ae30463971e9ec277d01ba358817992260281e5dceaaab676f9093dc05a941a520ce
data/CHANGELOG.md
CHANGED
@@ -1,8 +1,15 @@
 # Changelog
 
+## v2.1.0
+
+- `per_document` task `scope` will no longer let you specify `.only` or
+  `.without` as it could potentially cause data loss.
+- `per_document` task `scope` now correctly works with `.includes` even
+  in parallel execution mode.
+
 ## v2.0.0
 
-- `Hekenga::Iterator` has been replaced by `Hekenga::IdIterator`. If any
+- (breaking) `Hekenga::Iterator` has been replaced by `Hekenga::IdIterator`. If any
   selector or sort is set on a document task migration scope, it no longer forces an
   ascending ID sort. This should help to prevent index misses, though there is a
   tradeoff that documents being concurrently updated may be skipped or
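The first v2.1.0 changelog entry is easier to see with a toy model. The sketch below uses plain hashes rather than Mongoid, and it assumes the write path replaces the whole stored document (the validation message added in this release mentions `replace_one`). The `project` and `replace_one` methods are hypothetical stand-ins, not Hekenga or MongoDB driver APIs.

```ruby
# Toy model: a "collection" is a hash of _id => document.
collection = {
  1 => { "_id" => 1, "email" => "A@Example.com", "name" => "Ada" }
}

# Hypothetical stand-in for loading a document through a scope with .only(:email).
def project(doc, fields)
  doc.slice("_id", *fields)
end

# Hypothetical stand-in for a whole-document replacement write.
def replace_one(collection, doc)
  collection[doc["_id"]] = doc
end

# Load a partial document, change it, and write it back in full.
partial = project(collection[1], ["email"])
partial["email"] = partial["email"].downcase
replace_one(collection, partial)

# The stored document now only has the projected fields; "name" is gone.
stored = collection[1]
puts stored.inspect
```

Under that assumption, any field left out of the projection is silently dropped on write, which is presumably why such scopes are now rejected up front.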
data/CLAUDE.md
ADDED
@@ -0,0 +1,60 @@
+# CLAUDE.md
+
+This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
+
+## Project Overview
+
+Hekenga is a Ruby gem providing a migration framework for MongoDB (via Mongoid). It supports sequential and parallel document processing via ActiveJob, with error recovery, validation tracking, and a Thor-based CLI.
+
+## Common Commands
+
+```bash
+# Run full test suite (requires MongoDB - see docker-compose.yml)
+rake spec
+
+# Run a single spec file
+rake spec SPEC=spec/hekenga/document_task_spec.rb
+
+# Install gem locally
+bundle exec rake install
+
+# Interactive console with gem loaded
+bin/console
+```
+
+## Architecture
+
+### Migration Flow
+
+```
+Migration.perform! → MasterProcess.run! → launches tasks in threads
+  SimpleTask: executes up/down blocks directly
+  DocumentTask: iterates documents → batch → execute → write (sequential)
+  ParallelTask: splits into ID batches → enqueues ParallelJob per batch (via ActiveJob)
+```
+
+### Key Components
+
+- **`Hekenga::Migration`** — main migration class, orchestrates tasks
+- **`Hekenga::MasterProcess`** — launches tasks, manages execution/recovery/progress
+- **`Hekenga::DSL::*`** — fluent DSL for defining migrations (`DSL::Migration`, `DSL::SimpleTask`, `DSL::DocumentTask`)
+- **`Hekenga::DocumentTaskExecutor`** — core document processing: filter → up block → validate → write
+- **`Hekenga::ParallelTask`** / **`Hekenga::ParallelJob`** — parallel execution via ActiveJob
+- **`Hekenga::DocumentTaskRecord`** — Mongoid doc tracking parallel task progress
+- **`Hekenga::Log`** — Mongoid doc tracking migration/task status (`:naught`, `:running`, `:complete`, `:failed`, `:skipped`)
+- **`Hekenga::Failure::*`** — error/validation/write/cancelled failure tracking subclasses
+- **`Hekenga::IdIterator`** / **`Hekenga::MongoidIterator`** — efficient document iteration for parallel vs sequential paths
+
+### Task Types
+
+- **SimpleTask** — one-off up/down blocks, no document iteration
+- **DocumentTask** — per-document processing with scope, filter, setup, up, down, after blocks; supports `parallel!`, `timeless!`, `always_write!`, `use_transaction!`, configurable write strategies (`:update` vs `:delete_then_insert`)
+
+### Configuration
+
+Via `Hekenga.configure` block — sets migration directory and report frequency. Thread-safe registry tracks all migrations.
+
+## Dependencies
+
+- **mongoid** (>= 6), **activejob** (>= 5), **thor** (1.2.1)
+- Test: **rspec** (~> 3.0), **database_cleaner-mongoid** (~> 2.0), **pry**
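The `ParallelTask` step in the migration flow described in CLAUDE.md (split into ID batches, then one job per batch) can be pictured with a small sketch. This is plain Ruby and not Hekenga's implementation: `enqueue_batches` and the in-memory `queue` array are hypothetical stand-ins for enqueueing one ActiveJob job per batch.

```ruby
# Hypothetical sketch: split a list of IDs into fixed-size batches and
# "enqueue" one job description per batch into an in-memory queue.
def enqueue_batches(ids, batch_size, queue)
  ids.each_slice(batch_size) do |batch|
    queue << { job: "ParallelJob", ids: batch }
  end
  queue
end

# Seven IDs with a batch size of three yield three jobs (3 + 3 + 1).
queue = enqueue_batches((1..7).to_a, 3, [])
queue.each { |job| puts "#{job[:job]} -> #{job[:ids].inspect}" }
```

Each job carries only its own IDs, so a worker can reload exactly that batch independently of the others.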
data/README.md
CHANGED
@@ -1,7 +1,7 @@
 # Hekenga
 
-
-processing via ActiveJob, chained jobs and error recovery.
+A migration framework for MongoDB (via Mongoid) that supports parallel document
+processing via ActiveJob, chained jobs, and error recovery.
 
 ## Installation
 
@@ -19,13 +19,135 @@ Or install it yourself as:
 
     $ gem install hekenga
 
+## Configuration
+
+```ruby
+Hekenga.configure do |config|
+  config.dir = ["db", "hekenga"] # where migration files live (relative to root)
+  config.root = Dir.pwd # application root
+end
+```
+
+Migrations are stored as Ruby files in the configured directory (default: `db/hekenga/`).
+
 ## Usage
 
-CLI
+### CLI
+
+```
+$ hekenga help # Show all available commands
+$ hekenga generate <description> # Generate a new migration scaffold
+$ hekenga status # Show status of all migrations
+$ hekenga run_all! # Run all pending migrations in date order
+$ hekenga run! <path_or_pkey> # Run a specific migration
+$ hekenga run! <path_or_pkey> --test # Dry run (no writes persisted)
+$ hekenga run! <path_or_pkey> --clear # Clear logs before running
+$ hekenga recover! <path_or_pkey> # Re-process failed/invalid records
+$ hekenga cancel # Cancel all active migrations
+$ hekenga skip <path_or_pkey> # Mark a migration as skipped
+$ hekenga clear! <path_or_pkey> # Remove all logs/failures for a migration
+$ hekenga cleanup # Remove all failure logs
+```
+
+### Writing Migrations
+
+Generate a migration scaffold:
+
+    $ hekenga generate "Add default role to users"
+
+#### Simple Tasks
+
+Simple tasks run arbitrary code once. Use `actual?` and `test?` to check execution mode.
+
+```ruby
+Hekenga.migration do
+  description "Backfill analytics collection"
+  created "2024-01-15 10:00"
+
+  task "Create indexes" do
+    up do
+      Analytics.create_indexes if actual?
+    end
+  end
+end
+```
+
+#### Document Tasks
+
+Document tasks iterate over a Mongoid scope and process each document in batches.
+
+```ruby
+Hekenga.migration do
+  description "Normalize user emails"
+  created "2024-01-15 10:00"
+  batch_size 100 # default batch size for all tasks in this migration
+
+  per_document "Downcase emails" do
+    scope User.all
+
+    # Called once per batch; instance variables are shared with filter/up/after
+    setup do |docs|
+      @domain_map = ExternalService.load_domains
+    end
+
+    # Return false to skip a document
+    filter do |doc|
+      doc.email.present?
+    end
+
+    # Mutate the document in place — Hekenga handles persistence
+    up do |doc|
+      doc.email = doc.email.downcase
+    end
+
+    # Called once per batch with the successfully written documents
+    after do |docs|
+      AuditLog.record(docs.map(&:id))
+    end
+  end
+end
+```
+
+#### Document Task Options
+
+```ruby
+per_document "Process records" do
+  scope MyModel.where(active: true)
+
+  parallel! # Process batches in parallel via ActiveJob
+  timeless! # Don't update Mongoid timestamps
+  always_write! # Write even if the document didn't change
+  skip_prepare! # Skip Mongoid callbacks on load
+  use_transaction! # Wrap each batch in a MongoDB transaction
+  batch_size 50 # Override migration-level batch size
+  write_strategy :update # :update (default) or :delete_then_insert
+  cursor_timeout 86_400 # Max cursor lifetime in seconds (default: 1 day)
+
+  up do |doc|
+    doc.status = "migrated"
+  end
+end
+```
+
+### Test Mode
+
+Run a migration without persisting changes:
+
+```ruby
+migration = Hekenga.find_migration("2024-01-15-add-default-role-to-users")
+migration.test_mode!
+migration.perform!
+```
+
+Or via the CLI:
+
+    $ hekenga run! <path_or_pkey> --test
+
+### Recovery
 
-
+When a migration fails (due to errors, invalid records, or write failures), Hekenga logs the failures and marks the migration as failed. You can re-process only the failed records:
 
-
+    $ hekenga recover! <path_or_pkey>
 
 ## Development
 
data/lib/hekenga/document_task.rb
CHANGED
@@ -24,6 +24,9 @@ module Hekenga
 
     def validate!
       raise Hekenga::Invalid.new(self, :ups, "missing") unless ups.any?
+      if scope&.options&.key?(:fields)
+        raise Hekenga::Invalid.new(self, :scope, "uses .only() or .without() which would cause data loss with replace_one")
+      end
     end
 
     def up!(context, document)
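To illustrate what the new guard in `validate!` reacts to, here is a minimal sketch with a stand-in scope object. It assumes, as the diff implies, that a scope built with `.only` or `.without` exposes a `:fields` key in its `options`. `FakeScope` and `projection?` are hypothetical names used only for this illustration.

```ruby
# Stand-in for a Mongoid criteria; only the `options` reader matters for the check.
FakeScope = Struct.new(:options)

# Mirrors the condition added in validate!: a nil-safe lookup of the :fields option,
# normalised to a strict boolean.
def projection?(scope)
  scope&.options&.key?(:fields) ? true : false
end

plain     = FakeScope.new({})
projected = FakeScope.new({ fields: { "email" => 1 } })

puts projection?(plain)      # false
puts projection?(projected)  # true
puts projection?(nil)        # false (no scope set at all)
```

The safe-navigation operators mean a task without a scope passes this check rather than raising `NoMethodError`.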
data/lib/hekenga/document_task_executor.rb
CHANGED
@@ -59,7 +59,11 @@ module Hekenga
     end
 
     def record_scope
-      task.scope.klass.unscoped.in(_id: task_record.ids)
+      scope = task.scope.klass.unscoped.in(_id: task_record.ids)
+      if task.scope.inclusions.any?
+        scope = scope.includes(*task.scope.inclusions.map(&:name))
+      end
+      scope
     end
 
     def records
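The `record_scope` change can be sketched without a database. The classes below are hypothetical stand-ins, not Mongoid: they model only a criteria that remembers its IDs and eager-load names, to show why a scope rebuilt from the model class loses `.includes` unless the inclusions are copied across.

```ruby
# Hypothetical stand-in for association metadata (responds to #name).
Inclusion = Struct.new(:name)

# Hypothetical stand-in for a criteria; tracks IDs and eager-load inclusions.
class FakeCriteria
  attr_reader :ids, :inclusions

  def initialize(ids: [], inclusions: [])
    @ids = ids
    @inclusions = inclusions
  end

  # Returns a new criteria restricted to the given IDs.
  def in_ids(ids)
    FakeCriteria.new(ids: ids, inclusions: inclusions)
  end

  # Returns a new criteria with extra eager-load inclusions.
  def includes(*names)
    FakeCriteria.new(ids: ids, inclusions: inclusions + names.map { |n| Inclusion.new(n) })
  end
end

# Rebuild a per-batch scope from scratch, then re-apply the task scope's inclusions.
def record_scope(task_scope, batch_ids)
  scope = FakeCriteria.new.in_ids(batch_ids)
  if task_scope.inclusions.any?
    scope = scope.includes(*task_scope.inclusions.map(&:name))
  end
  scope
end

task_scope = FakeCriteria.new.includes(:posts, :profile)
batch = record_scope(task_scope, [1, 2, 3])
puts batch.inclusions.map(&:name).inspect
```

Without the `if` block, `batch.inclusions` would be empty in this model, which mirrors the behaviour the 2.1.0 changelog describes as fixed for parallel execution.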
data/lib/hekenga/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: hekenga
 version: !ruby/object:Gem::Version
-  version: 2.
+  version: 2.1.0
 platform: ruby
 authors:
 - Tapio Saarinen
 autorequire:
 bindir: exe
 cert_chain: []
-date:
+date: 2026-04-23 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -148,6 +148,7 @@ files:
 - ".rspec"
 - ".travis.yml"
 - CHANGELOG.md
+- CLAUDE.md
 - Gemfile
 - README.md
 - Rakefile