familia 2.5.0 → 2.7.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (37)
  1. checksums.yaml +4 -4
  2. data/.github/workflows/ci.yml +6 -1
  3. data/CHANGELOG.rst +144 -0
  4. data/CLAUDE.md +18 -0
  5. data/Gemfile.lock +3 -3
  6. data/changelog.d/20260514_034522_claude_review_familia_issue_217.rst +46 -0
  7. data/docs/guides/feature-housekeeping.md +217 -0
  8. data/docs/guides/index.md +5 -1
  9. data/lib/familia/atomic_operations.rb +86 -0
  10. data/lib/familia/data_type.rb +4 -0
  11. data/lib/familia/errors.rb +21 -0
  12. data/lib/familia/features/housekeeping.rb +101 -0
  13. data/lib/familia/features/relationships/indexing/rebuild_strategies.rb +6 -65
  14. data/lib/familia/horreum/atomic_write.rb +239 -0
  15. data/lib/familia/horreum/management/audit.rb +853 -31
  16. data/lib/familia/horreum/management/audit_report.rb +99 -8
  17. data/lib/familia/horreum/management/repair.rb +236 -15
  18. data/lib/familia/horreum/management.rb +1 -1
  19. data/lib/familia/horreum/persistence.rb +2 -0
  20. data/lib/familia/horreum.rb +2 -0
  21. data/lib/familia/version.rb +1 -1
  22. data/lib/familia.rb +1 -0
  23. data/try/audit/audit_cross_references_try.rb +502 -0
  24. data/try/audit/audit_delim_override_try.rb +185 -0
  25. data/try/audit/audit_instance_scoped_multi_index_try.rb +198 -0
  26. data/try/audit/audit_related_fields_try.rb +364 -0
  27. data/try/audit/audit_report_try.rb +72 -3
  28. data/try/audit/audit_unique_indexes_try.rb +11 -0
  29. data/try/audit/m3_multi_index_stub_try.rb +235 -50
  30. data/try/audit/repair_all_robustness_try.rb +149 -0
  31. data/try/audit/repair_related_fields_try.rb +417 -0
  32. data/try/features/atomic_write_coverage_try.rb +148 -0
  33. data/try/features/atomic_write_try.rb +463 -0
  34. data/try/features/housekeeping/housekeeping_try.rb +207 -0
  35. data/try/thread_safety/atomic_write_ownership_race_try.rb +130 -0
  36. data/try/unit/atomic_operations_try.rb +77 -0
  37. metadata +17 -1
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 216f0e8232c1783dd031a43efce97bef6d53d095942da725892fdb4e78feb114
-   data.tar.gz: cc13a5871bce8ba8812e73142ef238e804cd67bbe6e672498feb2f72382d1a15
+   metadata.gz: 7b0138946cb9d48258b9f43b8a8a81aa895bfded213543feee354b3d383dd98a
+   data.tar.gz: bd5b6ee397c2ef19cef85fe7638ead34b5ed40a81c8362f539ed757d0d713ca2
  SHA512:
-   metadata.gz: 9f22dfe25d16e814e04ab07efbfc87c97ae0e62c964ec75daf9f7e9b318a04a96a6f8aec53b1727b85ecda6c82ea066100a09735b0fd932491a2642ddfde2007
-   data.tar.gz: f46f364c1288f5fe3d104210838bb68d033ed59b536d3d8ddb1d36d401e858f8204c85f94cea5561a75a1ffc05f002c33bc0cb46e1beea179e44d00614ec0e35
+   metadata.gz: 8108ee045d1f4f1bdc722a1b56a6518a32edebd18ae028dfdcd793a62044cc8bea509e06d8399179c11b1eeef971dd980d1bcd18654cd3a375af99b641c514a6
+   data.tar.gz: 395bcb27503bb30734d0d6e16fa8e95b33fa072dfaca56c2b9b06d33859c4cb915f766923a492cbdaae5407e45db2c76ee074a6c6d7aac245b822332f44d082c
data/.github/workflows/ci.yml CHANGED
@@ -18,11 +18,16 @@ jobs:

      runs-on: ubuntu-24.04

+     continue-on-error: ${{ matrix.continue-on-error }}
+
      strategy:
-       fail-fast: true
+       fail-fast: false
        matrix:
          ruby: ["3.4", "3.5"]
          continue-on-error: [false]
+         include:
+           - ruby: "4.0"
+             continue-on-error: true

      services:
        redis:
data/CHANGELOG.rst CHANGED
@@ -7,6 +7,150 @@ The format is based on `Keep a Changelog <https://keepachangelog.com/en/1.1.0/>`

  <!--scriv-insert-here-->

+ .. _changelog-2.7.0:
+
+ 2.7.0 — 2026-05-13
+ ==================
+
+ Added
+ -----
+
+ - New ``housekeeping`` feature for ``Familia::Horreum``: a declarative DSL
+   (``chore :name do |obj| ... end``) for registering named cleanup blocks on
+   a model class, plus an instance method ``tidy!`` that runs all (or one)
+   registered chore against a single object. The feature owns registration
+   and per-instance execution only -- iteration, batching, scheduling and
+   error aggregation are the consumer application's responsibility, keeping
+   it distinct from ``Familia::Migration`` (which is for versioned, one-shot
+   transformations). Resolves #258.
+
+ Documentation
+ -------------
+
+ - Added ``docs/guides/feature-housekeeping.md`` covering the API, the
+   ``housekeeping`` vs ``migration`` vs defensive-setter trade-off,
+   generated method reference, design constraints, and common patterns
+   (multiple chores, sequential steps in one chore, tracking modified
+   records, error aggregation).
+
+ AI Assistance
+ -------------
+
+ - Drafted the housekeeping feature module, the tryouts test suite, and the
+   guide using Claude Code, working from the API proposal in issue #258 and
+   the existing ``feature-relationships.md`` and ``safe_dump.rb`` as style
+   templates.
+
+ .. _changelog-2.6.0:
+
+ 2.6.0 — 2026-04-17
+ ==================
+
+ Added
+ -----
+
+ - ``audit_multi_indexes`` detects drift in class-level multi-indexes via a
+   three-phase sweep (stale members, missing live objects, orphaned buckets).
+   Instance-scoped indexes (``within:``) return ``:not_implemented``. PR #221
+
+ - ``audit_related_fields`` SCANs for instance-level collection keys
+   (``list``, ``set``, ``zset``, ``hashkey``) whose parent hash no longer
+   exists -- typically left behind by interrupted ``destroy!`` calls or
+   external key mutation. Class-level related fields are skipped. PR #221
+
+ - ``audit_cross_references`` walks live identifiers against class-level
+   unique indexes to surface drift modes per-registry audits miss:
+   ``in_instances_missing_unique_index`` and
+   ``index_points_to_wrong_identifier`` (split-brain). PR #221
+
+ - ``repair_related_fields!`` class method DELs orphaned collection keys
+   from an audit result and returns ``{removed_keys:, failed_keys:,
+   status:}``. ``repair_all!`` gains opt-in ``audit_collections:`` and
+   ``check_cross_refs:`` kwargs (both default ``false``); only
+   ``related_fields`` is auto-repaired, cross-reference drift is
+   reported for manual resolution. PR #221
+
+ - ``Familia::AtomicOperations`` module exposing ``atomic_swap`` and
+   ``build_temp_key`` as reusable primitives for rebuild-then-swap
+   workflows (relies on native ``RENAME`` atomicity). PR #221
+
+ - ``Horreum#atomic_write(&block)`` wraps scalar persistence and
+   collection mutations in a single MULTI/EXEC. Unlike
+   ``save_with_collections``, failures roll back scalars too. All
+   participating DataTypes must share ``logical_database``; mismatches
+   raise ``Familia::CrossDatabaseError``. (#220)
+
+ Changed
+ -------
+
+ - ``health_check`` accepts new opt-in kwargs ``audit_collections:`` and
+   ``check_cross_refs:`` (both default ``false``). When omitted, the
+   corresponding report dimensions are ``nil`` and ``complete?`` returns
+   ``false`` until opted in. PR #221
+
+ - ``atomic_swap`` and ``build_temp_key`` relocated from
+   ``Indexing::RebuildStrategies`` to ``Familia::AtomicOperations``.
+   Internal callers delegate through; downstream direct callers should
+   switch. Semantics preserved verbatim from PR #247. PR #221
+
+ - ``health_check`` now reuses a single ``scan_identifiers`` +
+   ``load_multi`` pass across unique- and multi-index audits, reducing
+   SCANs from ``1 + N + M`` to ``2`` regardless of declared indexes.
+   Behavior and return shapes unchanged. PR #221
+
+ - Audit methods pipeline batched Redis calls: ``audit_cross_references``
+   uses HMGET per batch instead of per-object HGET;
+   ``discover_multi_index_buckets`` and ``audit_single_related_field``
+   batch SCAN results in slices of 100 inside ``pipelined`` blocks,
+   collapsing M round trips to ~M/100. PR #221
+
+ Fixed
+ -----
+
+ - ``AuditReport#healthy?`` now considers multi-index ``missing`` entries;
+   ``to_h`` / ``to_s`` include the ``missing`` count. Previously a report
+   could show ``issues_found`` while ``healthy?`` returned true. PR #221
+
+ - ``atomic_write`` cross-database guard no longer false-positives when a
+   Horreum inherits its ``logical_database`` and a related field explicitly
+   sets ``logical_database: 0``. Both sides now resolve to concrete
+   integers before comparison. (#220)
+
+ - ``atomic_write`` same-instance re-entrancy guard now uses a module-level
+   ``Mutex`` to serialise the ``@atomic_write_owner`` check-then-set,
+   closing a narrow race between concurrent entries. (#220)
+
+ - ``atomic_write`` clears the dirty flag only when
+   ``MultiResult.successful?`` is true. Previously transactions whose
+   individual commands returned exception objects (MULTI swallows these)
+   could leave the object marked clean. (#220)
+
+ - ``Horreum.scan_pattern``, ``discover_multi_index_buckets``, and
+   ``audit_instance_participations`` now respect ``Familia.delim`` instead
+   of hardcoding ``:``. Under a custom delim, every audit grounded in
+   these SCANs (instances, unique, multi, cross-references, participations)
+   silently saw zero keys and reported clean. PR #221
+
+ AI Assistance
+ -------------
+
+ - Implementation and test coverage for the new audit dimensions
+   (``audit_multi_indexes``, ``audit_related_fields``,
+   ``audit_cross_references``, ``repair_related_fields!``), the
+   ``AuditReport`` extensions, the ``healthy?``/``to_h``/``to_s`` fix, the
+   ``Familia::AtomicOperations`` extraction, the ``health_check`` caching
+   refactor, and the four audit performance/correctness fixes
+   (delimiter-aware SCAN, batched HMGET, pipelined SMEMBERS, pipelined
+   EXISTS) were authored with AI assistance. PR #221
+
+ - ``atomic_write`` design, implementation, tests, and review were
+   coordinated across Claude Code agents (``feature-dev:code-architect``,
+   ``backend-dev``, ``qa-automation-engineer``,
+   ``feature-dev:code-reviewer``). The reviewer caught a silent-corruption
+   gap in the cross-database guard; follow-up fixes (false-positive guard,
+   re-entrancy race, MultiResult success semantics) were surfaced by
+   ``gemini-code-assist`` and verified by the QA and reviewer agents. (#220)
+
  .. _changelog-2.5.0:

  2.5.0 — 2026-04-17
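The 2.6.0 entry about batching SCAN results "in slices of 100 inside ``pipelined`` blocks" describes a round-trip reduction that can be illustrated without Redis. The sketch below is a rough illustration only: `CountingClient`, `exists_batch`, and the key names are hypothetical stand-ins invented for this example, not part of Familia or redis-rb. It simply counts how many calls reach the "server" when keys are checked one at a time versus in slices of 100.

```ruby
# Hypothetical stand-in for a Redis client; it only counts round trips.
# Nothing here is Familia's or redis-rb's actual implementation.
class CountingClient
  attr_reader :round_trips

  def initialize(existing_keys)
    @existing = existing_keys
    @round_trips = 0
  end

  # One call per key: one round trip each.
  def exists?(key)
    @round_trips += 1
    @existing.include?(key)
  end

  # One call per batch: a single round trip answers every key in the batch,
  # which is the effect a pipelined block has on network traffic.
  def exists_batch(keys)
    @round_trips += 1
    keys.map { |key| @existing.include?(key) }
  end
end

BATCH_SIZE = 100 # slice size quoted in the changelog entry

# Hypothetical key names, chosen only for the demonstration.
keys = (1..250).map { |i| "user:#{i}:object" }
live = keys.first(200)

naive = CountingClient.new(live)
naive_missing = keys.reject { |key| naive.exists?(key) }

batched = CountingClient.new(live)
batched_missing = keys.each_slice(BATCH_SIZE).flat_map do |slice|
  flags = batched.exists_batch(slice)
  slice.zip(flags).reject { |_key, found| found }.map(&:first)
end
```

Both approaches find the same missing keys; only the number of round trips differs (one per key versus one per slice).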
data/CLAUDE.md CHANGED
@@ -250,6 +250,24 @@ plan.save # If this raises, features are already mutated

  **Cross-database limitation**: MULTI/EXEC transactions only work within a single Redis database number. If scalar fields and a collection use different `logical_database` values, they cannot share a transaction. The `save_with_collections` pattern handles this by sequencing the operations rather than wrapping them in MULTI.

+ **Atomic pattern -- scalars and collections in one MULTI/EXEC:**
+ ```ruby
+ plan.atomic_write do
+   plan.name = "Premium" # Deferred: queued as HMSET by persist_to_storage
+   plan.features.clear # Immediate: queued as DEL in the open MULTI
+   plan.features.add("sso") # Immediate: queued as SADD in the open MULTI
+ end
+ # Block body runs first, then persist_to_storage queues HMSET + index updates
+ # + touch_instances!, then EXEC fires. All-or-nothing.
+ ```
+
+ Unlike `save_with_collections` (which sequences two separate writes and cannot roll back scalars if a collection operation fails), `atomic_write` composes the existing `transaction` infrastructure so every command lands in one MULTI/EXEC. Collection mutations auto-route into the open transaction because `DataType#dbclient` honours `Fiber[:familia_transaction]`.
+
+ Constraints:
+ - All related DataTypes (instance-level and class-level) must share the parent Horreum's `logical_database`. A mismatch raises `Familia::CrossDatabaseError` -- fall back to `save_with_collections` in that case.
+ - Cannot nest inside another `transaction` or `atomic_write` (raises `Familia::OperationModeError`).
+ - Collection method return values inside the block are `Redis::Future` objects (inherent to MULTI) -- do not inspect them before EXEC.
+
  **Instances timeline**: The class-level `instances` sorted set is a timeline of last-modified times, not a registry. `persist_to_storage` (called by `save`) and `commit_fields`/`batch_update` all call `touch_instances!` to update the timestamp. Use `in_instances?(identifier)` for fast O(log N) checks without loading the object.

  ### Instances Timeline Lifecycle
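The queue-then-commit idea behind the `atomic_write` hunk above can be sketched in plain Ruby. This is a toy model under stated assumptions, not Familia's implementation: `MiniStore` and its key names are hypothetical, and it only models the case where the block raises before anything is committed. It does not model Redis behaviour for commands that fail inside EXEC.

```ruby
# Hypothetical in-memory model of queue-then-commit; not Familia's atomic_write.
class MiniStore
  attr_reader :data

  def initialize
    @data = {}
  end

  # Collect writes while the block runs; apply them only if it finishes.
  def atomic_write
    queue = []
    yield queue
    queue.each { |key, value| @data[key] = value }
    true
  end
end

store = MiniStore.new

# Both the scalar and the collection write are applied together.
store.atomic_write do |tx|
  tx << ["plan:name", "Premium"]
  tx << ["plan:features", ["sso"]]
end

begin
  store.atomic_write do |tx|
    tx << ["plan:name", "Broken"]
    raise "collection update failed"
  end
rescue RuntimeError
  # Nothing from the failed block was applied; the queued write is discarded.
end
```

Because writes are only applied after the block returns, the failed second call leaves the first call's values untouched.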
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
    remote: .
    specs:
-     familia (2.5.0)
+     familia (2.7.0)
        concurrent-ruby (~> 1.3)
        connection_pool (>= 2.4, < 4.0)
        csv (~> 3.3)
@@ -81,7 +81,7 @@ GEM
      logger (1.7.0)
      mcp (0.8.0)
        json-schema (>= 4.1)
-     minitest (5.26.0)
+     minitest (5.27.0)
      oj (3.16.13)
        bigdecimal (>= 3.0)
        ostruct (>= 0.2)
@@ -188,7 +188,7 @@ GEM
      tty-screen (0.8.2)
      unicode-display_width (3.2.0)
        unicode-emoji (~> 4.1)
-     unicode-emoji (4.1.0)
+     unicode-emoji (4.2.0)
      uri-valkey (1.4.0)
      yard (0.9.37)
      zeitwerk (2.7.3)
data/changelog.d/20260514_034522_claude_review_familia_issue_217.rst ADDED
@@ -0,0 +1,46 @@
+ Added
+ ~~~~~
+
+ - Instance-scoped ``audit_multi_indexes`` is now fully implemented.
+   Discovers per-scope bucket keys via SCAN, partitions them by scope
+   instance, and reports stale members, orphaned buckets, and missing
+   entries in the same shape as the class-level audit. Orphan entries
+   carry a ``:reason`` (``:scope_missing`` or ``:field_value_unheld``)
+   and a ``:scope_id``. Missing entries are detected via the indexed
+   class's ``participates_in`` relationship to the scope class; when
+   absent, the result carries ``missing_status: :not_audited``.
+   Resolves the ``:not_implemented`` follow-up from #217.
+
+ - ``repair_multi_indexes!`` class method that invokes the existing
+   ``rebuild_<index_name>`` methods for both class-level (one call on
+   the indexed class) and instance-scoped (one call per scope
+   instance) multi-indexes. Indexes whose audit status is ``:ok`` are
+   skipped; rebuild methods that don't exist or scope classes
+   without an ``instances`` collection are recorded in ``:skipped``
+   with a reason.
+
+ Changed
+ ~~~~~~~
+
+ - ``repair_all!`` now runs each repair stage inside its own rescue
+   boundary; a failure in one dimension no longer prevents the others
+   from running. The return hash gains ``:status`` (``:ok`` or
+   ``:partial_failure``), ``:errors`` (per-stage exception details
+   when raised), and ``:multi_indexes`` (results from the new
+   ``repair_multi_indexes!``). An opt-in ``verify: true`` kwarg
+   re-runs ``health_check`` after repair and exposes the result as
+   ``:post_audit`` / ``:verified`` so callers can confirm the run
+   actually drove the model back to a healthy state.
+
+ - ``AuditReport#complete?`` is no longer false-positive due to
+   ``:not_implemented`` stubs in ``multi_indexes`` -- instance-scoped
+   indexes return ``:ok`` or ``:issues_found`` like class-level ones.
+
+ AI Assistance
+ ~~~~~~~~~~~~~
+
+ - Instance-scoped multi-index audit algorithm (bucket discovery,
+   scope existence batching, participation-driven missing detection),
+   ``repair_multi_indexes!``, the ``repair_all!`` robustness
+   refactor, and the accompanying tryouts coverage were authored
+   with Claude Code assistance against the #217 review branch.
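The "own rescue boundary" behaviour described for ``repair_all!`` above follows a general pattern that is easy to sketch. The code below is a minimal illustration, assuming a hypothetical `run_stages` helper and made-up stage names and result shapes. Only the ``:status`` and ``:errors`` keys and the ``:ok`` / ``:partial_failure`` values come from the changelog text; everything else is invented for the example and is not Familia's implementation.

```ruby
# Hypothetical helper: run each named stage inside its own rescue boundary
# so that one failing stage does not stop the stages after it.
def run_stages(stages)
  report = { status: :ok, errors: {} }
  stages.each do |name, stage|
    report[name] = stage.call
  rescue StandardError => e
    report[:status] = :partial_failure
    report[:errors][name] = { class: e.class.name, message: e.message }
  end
  report
end

# Stage names and result hashes are illustrative assumptions.
stages = {
  instances: -> { { removed: 0 } },
  unique_indexes: -> { raise ArgumentError, "index key missing" },
  multi_indexes: -> { { rebuilt: 2 } }
}

report = run_stages(stages)
```

The stage after the failing one still runs, and the caller can branch on `report[:status]` instead of rescuing around the whole call.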
data/docs/guides/feature-housekeeping.md ADDED
@@ -0,0 +1,217 @@
+ # Housekeeping Feature Guide
+
+ The Housekeeping feature provides a declarative DSL for registering named cleanup chores on Horreum models. It is designed for short-lived, repeated tidying against fields whose values have drifted over time -- not for versioned, one-shot migrations.
+
+ > [!TIP]
+ > Enable with `feature :housekeeping` and register cleanup blocks with `chore :name do |obj| ... end`. Run them with `obj.tidy!`. Iteration and persistence are the caller's responsibility.
+
+ ## Quick Start
+
+ ```ruby
+ class Organization < Familia::Horreum
+   feature :housekeeping
+
+   field :planid
+
+   chore :standardize_planid do |org|
+     canonical = case org.planid
+                 when "pro", "Pro", "professional_v1" then "professional"
+                 when "free", "Free", "basic" then "free"
+                 end
+     if canonical && canonical != org.planid
+       org.planid = canonical
+       org.save
+       true
+     end
+   end
+ end
+
+ org = Organization.from_identifier("acme-corp")
+ org.tidy!
+ # => { standardize_planid: true }
+ ```
+
+ ## When to Use
+
+ | Tool | Use When |
+ |------|----------|
+ | `Familia::Migration::Base` | Versioned, one-shot transformation tracked across releases |
+ | `feature :housekeeping` | Short-lived chore run nightly until data is clean, then removed |
+ | Defensive code in setters | Permanent invariant enforced on every write |
+
+ Housekeeping fills the gap between migrations (heavy, tracked) and inline coercion (permanent). Register a chore, run it on a schedule for a few days, verify clean data, then delete the chore and the defensive code that handled the messy values.
+
+ ## Core Capabilities
+
+ ### Registration -- Class-Level DSL
+
+ Each chore is a named block bound to the model class:
+
+ ```ruby
+ class User < Familia::Horreum
+   feature :housekeeping
+
+   field :email, :timezone
+
+   chore :downcase_email do |user|
+     next unless user.email && user.email != user.email.downcase
+     user.email = user.email.downcase
+     user.save
+     true
+   end
+
+   chore :default_timezone do |user|
+     next if user.timezone
+     user.timezone = "UTC"
+     user.save
+     true
+   end
+ end
+
+ User.chores.keys
+ # => [:downcase_email, :default_timezone]
+ ```
+
+ ### Execution -- Single Instance
+
+ Run all registered chores, or one by name:
+
+ ```ruby
+ user = User.from_identifier("alice@example.com")
+
+ user.tidy!
+ # => { downcase_email: true, default_timezone: nil }
+
+ user.tidy!(:downcase_email)
+ # => { downcase_email: true }
+ ```
+
+ The return value is a hash mapping chore name to the block's return value. A truthy result signals "modified"; `nil` or `false` signals "no-op". The feature does not interpret these values -- they are passed through for the caller's stats collection.
+
+ ### Iteration -- Caller's Responsibility
+
+ The feature operates on a single instance. Bulk runs live in the consumer app:
+
+ ```ruby
+ # nightly rake task
+ namespace :data do
+   task tidy_orgs: :environment do
+     stats = Hash.new(0)
+     Organization.instances.each do |id|
+       org = Organization.find_by_id(id) or next
+       results = org.tidy!
+       results.each { |name, result| stats[name] += 1 if result }
+     end
+     puts stats.inspect
+   end
+ end
+ ```
+
+ The feature has no opinion about batching, SCAN vs KEYS, error aggregation, or scheduling -- the consumer app owns all of that.
+
+ ## Generated Method Reference
+
+ ### When a class declares `feature :housekeeping`
+
+ | Class | Method | Purpose |
+ |-------|--------|---------|
+ | **Class** | `chore(name, &block)` | Register a chore |
+ | | `chores` | Hash of registered chores |
+ | **Instance** | `tidy!(name = nil)` | Run all (or one) chore; returns Hash |
+
+ ## Design Constraints
+
+ 1. **No implicit saves.** The block must call `save` (or `commit_fields`) itself. The feature does not auto-persist.
+ 2. **No iteration.** Operates on a single instance. There is no class-level `tidy_all!`.
+ 3. **No ordering.** Chores run in registration order, but should not depend on each other. If order matters, write one chore with sequential steps.
+ 4. **Idempotent by convention.** Use the conditional pattern (`if canonical && canonical != org.planid`) so a second run is a no-op.
+ 5. **Errors propagate.** The block can raise; the iteration code in the consumer app decides whether to rescue.
+
+ ## Common Patterns
+
+ ### Multiple Independent Chores
+
+ ```ruby
+ class Customer < Familia::Horreum
+   feature :housekeeping
+
+   chore :trim_whitespace do |c|
+     next unless c.name && c.name != c.name.strip
+     c.name = c.name.strip
+     c.save
+     true
+   end
+
+   chore :uppercase_country do |c|
+     next unless c.country && c.country != c.country.upcase
+     c.country = c.country.upcase
+     c.save
+     true
+   end
+ end
+
+ customer.tidy!
+ # => { trim_whitespace: true, uppercase_country: nil }
+ ```
+
+ ### Sequential Steps in One Chore
+
+ When step B depends on step A's result, keep them in one block:
+
+ ```ruby
+ chore :reconcile_billing do |account|
+   changed = false
+   if account.plan_id == "legacy"
+     account.plan_id = "standard"
+     changed = true
+   end
+   if account.plan_id == "standard" && account.billing_cycle.nil?
+     account.billing_cycle = "monthly"
+     changed = true
+   end
+   if changed
+     account.save
+     true
+   end
+ end
+ ```
+
+ ### Tracking Modified Records
+
+ ```ruby
+ modified = []
+ Organization.instances.each do |id|
+   org = Organization.find_by_id(id) or next
+   results = org.tidy!
+   modified << id if results.values.any?
+ end
+ puts "Modified #{modified.size} records: #{modified.inspect}"
+ ```
+
+ ### Error Aggregation
+
+ ```ruby
+ errors = {}
+ Organization.instances.each do |id|
+   org = Organization.find_by_id(id) or next
+   begin
+     org.tidy!
+   rescue => e
+     errors[id] = e.message
+   end
+ end
+ ```
+
+ ## Best Practices
+
+ 1. **Keep chores short-lived.** Delete the registration once data is clean.
+ 2. **Use `||=` and conditional checks** so a second run is a no-op.
+ 3. **Save inside the block** -- the feature does not persist for you.
+ 4. **Return truthy on modification, nil on no-op** so callers can collect stats.
+ 5. **Prefer migrations for one-shot, versioned transformations.** Use housekeeping for ongoing tidying that can be run repeatedly.
+
+ ## See Also
+
+ - [**Writing Migrations**](writing-migrations.md) - Versioned, one-shot data transformations
+ - [**Field System**](field-system.md) - How field values are stored and serialized
+ - [**Feature System**](feature-system.md) - How features are mixed into Horreum classes
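To make the registration and execution contract in the guide above concrete without a Redis connection, here is a minimal plain-Ruby sketch of a chore registry. It is an assumption-laden illustration, not the code in `lib/familia/features/housekeeping.rb`: the `MiniHousekeeping` module and `Account` class are hypothetical, nothing is persisted, and raising `KeyError` for an unknown chore name is this sketch's choice rather than documented behaviour.

```ruby
# Hypothetical stand-in for `feature :housekeeping`; illustrative only.
module MiniHousekeeping
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Registry of chore name => block, kept in registration order.
    def chores
      @chores ||= {}
    end

    def chore(name, &block)
      chores[name.to_sym] = block
    end
  end

  # Runs all chores (or one by name) and returns { name => block result }.
  def tidy!(name = nil)
    registry = self.class.chores
    selected = name ? { name.to_sym => registry.fetch(name.to_sym) } : registry
    selected.transform_values { |block| block.call(self) }
  end
end

class Account
  include MiniHousekeeping

  attr_accessor :email, :timezone

  def initialize(email:, timezone: nil)
    @email = email
    @timezone = timezone
  end

  chore :downcase_email do |account|
    next unless account.email && account.email != account.email.downcase
    account.email = account.email.downcase
    true
  end

  chore :default_timezone do |account|
    next if account.timezone
    account.timezone = "UTC"
    true
  end
end

account = Account.new(email: "Alice@Example.com", timezone: "UTC")
first_run = account.tidy!
second_run = account.tidy! # conditional chores make the second pass a no-op
single = Account.new(email: "bob@example.com").tidy!(:default_timezone)
```

The second pass returning only `nil` values is the idempotency convention from the Design Constraints section: each chore checks whether work is needed before changing anything.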
data/docs/guides/index.md CHANGED
@@ -37,9 +37,13 @@ Welcome to the comprehensive documentation for Familia v2.0. This guide collecti
  13. **[Quantization](feature-quantization.md)** - Time-based data bucketing for analytics
  14. **[Time Literals](time-literals.md)** - Time manipulation and formatting utilities

+ ### 🧹 Data Maintenance
+
+ 15. **[Housekeeping](feature-housekeeping.md)** - Declarative cleanup chores for drifted field values
+
  ### 🛠️ Implementation & Usage

- 15. **[Optimized Loading](optimized-loading.md)** - Reduce Redis commands by 50-96% for bulk object loading _(new!)_
+ 16. **[Optimized Loading](optimized-loading.md)** - Reduce Redis commands by 50-96% for bulk object loading _(new!)_


  ## 🚀 Quick Start Examples
data/lib/familia/atomic_operations.rb ADDED
@@ -0,0 +1,86 @@
+ # lib/familia/atomic_operations.rb
+ #
+ # frozen_string_literal: true
+
+ # Familia
+ #
+ # A family warehouse for your keystore data.
+ #
+ module Familia
+   # AtomicOperations provides Redis utilities for atomic, zero-downtime data
+   # replacement. These primitives are datastore-level building blocks shared
+   # across index rebuilds, audit/repair routines, and any other code that
+   # needs to swap a key's contents without exposing a transient empty state.
+   #
+   # The canonical pattern:
+   # 1. Build replacement contents in a temporary key (see {.build_temp_key}).
+   # 2. Atomically swap it into place with {.atomic_swap}.
+   #
+   # All methods are module_function-style; call them directly on the module.
+   #
+   # @example Atomic index rebuild
+   #   temp_key = Familia::AtomicOperations.build_temp_key(final_key)
+   #   # ... populate temp_key ...
+   #   Familia::AtomicOperations.atomic_swap(temp_key, final_key, redis)
+   #
+   module AtomicOperations
+     # Builds a temporary key name for atomic swaps
+     #
+     # @param base_key [String] The final index key
+     # @return [String] Temporary key with timestamp suffix
+     #
+     def self.build_temp_key(base_key)
+       timestamp = Familia.now.to_i
+       "#{base_key}:rebuild:#{timestamp}"
+     end
+
+     # Performs atomic swap of temp key to final key.
+     #
+     # Non-empty rebuilds use Redis RENAME (>= 2.6), which atomically
+     # replaces final_key if it exists. Readers observe either the old
+     # index or the new one; there is no window in which final_key is
+     # absent. This avoids the partial-update, race-condition, and
+     # stale-visibility problems of a two-step DEL+RENAME sequence.
+     #
+     # Empty rebuilds (no temp key) intentionally DEL final_key so the
+     # live index reflects the empty result set. In that branch readers
+     # can observe final_key as absent -- this is the correct outcome for
+     # an index with zero members, not a transient gap.
+     #
+     # @param temp_key [String] The temporary key containing rebuilt index
+     # @param final_key [String] The live index key
+     # @param redis [Redis] The Redis connection
+     #
+     def self.atomic_swap(temp_key, final_key, redis)
+       # Check if temp key exists first - RENAME fails on non-existent keys.
+       # redis.exists returns Integer across all supported redis-rb versions;
+       # using > 0 also tolerates a future boolean return without breaking.
+       unless redis.exists(temp_key) > 0
+         Familia.info "[AtomicOp] No temp key to swap (empty result set)"
+         # Empty rebuild: remove the live index so reads reflect zero members.
+         # This is the one path where readers can legitimately see final_key
+         # as absent -- the index genuinely has no entries.
+         redis.del(final_key)
+         return
+       end
+
+       # RENAME atomically replaces final_key if it exists (Redis >= 2.6),
+       # so readers never observe a missing final_key during a non-empty
+       # swap. A preceding DEL would open a gap where concurrent HGETs
+       # return nil.
+       redis.rename(temp_key, final_key)
+       Familia.info "[AtomicOp] Atomic swap completed: #{temp_key} -> #{final_key}"
+     rescue Redis::CommandError => e
+       # If temp key doesn't exist, just log and return (already handled above)
+       if e.message.include?("no such key")
+         Familia.info "[AtomicOp] Temp key vanished during swap (concurrent operation?)"
+         return
+       end
+
+       # For other errors, preserve temp key for debugging
+       Familia.warn "[AtomicOp] Atomic swap failed: #{e.message}"
+       Familia.warn "[AtomicOp] Temp key preserved for debugging: #{temp_key}"
+       raise
+     end
+   end
+ end
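The two branches of the swap (RENAME for a non-empty rebuild, DEL for an empty one) can be traced with an in-memory stand-in. This is a sketch under explicit assumptions: `FakeRedis` and `swap_into_place` are hypothetical names written for this example, the fake implements only the handful of commands the walkthrough needs, and logging and error handling from the real method are left out.

```ruby
# Hypothetical in-memory stand-in exposing only the commands the swap relies
# on. It is not redis-rb and does not model concurrency.
class FakeRedis
  def initialize
    @keys = {}
  end

  def hset(key, field, value)
    (@keys[key] ||= {})[field] = value
  end

  def hgetall(key)
    @keys.fetch(key, {}).dup
  end

  def exists(key)
    @keys.key?(key) ? 1 : 0
  end

  def del(key)
    @keys.delete(key) ? 1 : 0
  end

  def rename(source, destination)
    raise KeyError, "no such key" unless @keys.key?(source)

    @keys[destination] = @keys.delete(source)
    "OK"
  end
end

# Simplified version of the branch logic described in the comments above.
def swap_into_place(redis, temp_key, final_key)
  if redis.exists(temp_key) > 0
    redis.rename(temp_key, final_key) # replaces final_key in one step
    :swapped
  else
    redis.del(final_key) # empty rebuild: the index really has no entries
    :cleared
  end
end

redis = FakeRedis.new
final_key = "user:email_index" # hypothetical key name
redis.hset(final_key, "old@example.com", "u1")

# Non-empty rebuild: populate a temp key, then swap it over the live key.
temp_key = "#{final_key}:rebuild:#{Time.now.to_i}"
redis.hset(temp_key, "new@example.com", "u2")
first_outcome = swap_into_place(redis, temp_key, final_key)
after_first = redis.hgetall(final_key)

# Empty rebuild: the temp key was never written, so the live key is removed.
second_outcome = swap_into_place(redis, "#{final_key}:rebuild:0", final_key)
```

The design point the source comments make is that the non-empty path never deletes `final_key` first, so there is no moment where a reader finds the key missing.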
data/lib/familia/data_type.rb CHANGED
@@ -150,6 +150,10 @@ module Familia
      # @raise [Familia::Problem] if Familia.strict_write_order is true
      #
      def warn_if_dirty!
+       # Suppress warnings while parent is inside atomic_write — scalar setters in the block
+       # make the object dirty by design, so firing warnings for each collection call is noise.
+       return if @parent_ref.respond_to?(:atomic_write_mode?) && @parent_ref.atomic_write_mode?
+
        return unless @parent_ref.respond_to?(:dirty?) && @parent_ref.dirty?

        dirty = @parent_ref.dirty_fields
data/lib/familia/errors.rb CHANGED
@@ -43,6 +43,27 @@ module Familia
    # inside a transaction or pipeline
    class NestedTransactionError < OperationModeError; end

+   # Raised when atomic_write cannot include all DataType fields because
+   # they span multiple Redis databases (MULTI/EXEC cannot cross databases).
+   class CrossDatabaseError < OperationModeError
+     attr_reader :field_name, :field_database, :horreum_database
+
+     def initialize(field_name, field_database, horreum_database)
+       @field_name = field_name
+       @field_database = field_database
+       @horreum_database = horreum_database
+       super(build_message)
+     end
+
+     private
+
+     def build_message
+       "Cannot include field #{field_name} (logical_database: #{field_database}) in " \
+         "atomic_write: parent Horreum uses logical_database #{horreum_database}. " \
+         "MULTI/EXEC cannot span multiple databases."
+     end
+   end
+
    # Raised when attempting to reference a field that doesn't exist
    class UnknownFieldError < HorreumError; end
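A caller-side view of the new error may help: since `CrossDatabaseError` carries the field name and both database numbers, a caller can rescue it and fall back to a sequenced write, as the CLAUDE.md note in this diff suggests. The sketch below runs without the gem by redefining a stand-in hierarchy under a hypothetical `Sketch` namespace; `write_with_fallback` and its lambdas are likewise invented for the example and are not part of Familia.

```ruby
# Stand-in hierarchy so the example runs without the gem. The real classes
# are the ones shown in the errors.rb hunk above; `Sketch` is hypothetical.
module Sketch
  class OperationModeError < StandardError; end

  class CrossDatabaseError < OperationModeError
    attr_reader :field_name, :field_database, :horreum_database

    def initialize(field_name, field_database, horreum_database)
      @field_name = field_name
      @field_database = field_database
      @horreum_database = horreum_database
      super("Cannot include field #{field_name} (logical_database: #{field_database}) in " \
            "atomic_write: parent Horreum uses logical_database #{horreum_database}. " \
            "MULTI/EXEC cannot span multiple databases.")
    end
  end
end

# Hypothetical caller: try the atomic path, fall back to a sequenced write
# when the fields span databases.
def write_with_fallback(atomic:, sequenced:)
  atomic.call
  :atomic
rescue Sketch::CrossDatabaseError => e
  sequenced.call(e)
  :sequenced
end

seen = []
outcome = write_with_fallback(
  atomic: -> { raise Sketch::CrossDatabaseError.new(:features, 2, 0) },
  sequenced: ->(error) { seen << [error.field_name, error.field_database, error.horreum_database] }
)

message = Sketch::CrossDatabaseError.new(:features, 2, 0).message
```

Because the error subclasses `OperationModeError`, code that already rescues that parent class would also catch it.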