e11y 0.1.0 → 0.2.0

Files changed (70)
  1. checksums.yaml +4 -4
  2. data/.rspec +1 -0
  3. data/.rubocop.yml +20 -0
  4. data/CHANGELOG.md +151 -13
  5. data/README.md +1138 -104
  6. data/RELEASE.md +254 -0
  7. data/Rakefile +377 -0
  8. data/benchmarks/OPTIMIZATION.md +246 -0
  9. data/benchmarks/README.md +103 -0
  10. data/benchmarks/allocation_profiling.rb +253 -0
  11. data/benchmarks/e11y_benchmarks.rb +447 -0
  12. data/benchmarks/ruby_baseline_allocations.rb +175 -0
  13. data/benchmarks/run_all.rb +9 -21
  14. data/docs/00-ICP-AND-TIMELINE.md +2 -2
  15. data/docs/ADR-001-architecture.md +1 -1
  16. data/docs/ADR-004-adapter-architecture.md +247 -0
  17. data/docs/ADR-009-cost-optimization.md +231 -115
  18. data/docs/ADR-017-multi-rails-compatibility.md +103 -0
  19. data/docs/ADR-INDEX.md +99 -0
  20. data/docs/CONTRIBUTING.md +312 -0
  21. data/docs/IMPLEMENTATION_PLAN.md +1 -1
  22. data/docs/QUICK-START.md +0 -6
  23. data/docs/use_cases/UC-019-retention-based-routing.md +584 -0
  24. data/e11y.gemspec +28 -17
  25. data/lib/e11y/adapters/adaptive_batcher.rb +3 -0
  26. data/lib/e11y/adapters/audit_encrypted.rb +10 -4
  27. data/lib/e11y/adapters/base.rb +15 -0
  28. data/lib/e11y/adapters/file.rb +4 -1
  29. data/lib/e11y/adapters/in_memory.rb +6 -0
  30. data/lib/e11y/adapters/loki.rb +9 -0
  31. data/lib/e11y/adapters/otel_logs.rb +11 -9
  32. data/lib/e11y/adapters/sentry.rb +9 -0
  33. data/lib/e11y/adapters/yabeda.rb +54 -10
  34. data/lib/e11y/buffers.rb +8 -8
  35. data/lib/e11y/console.rb +52 -60
  36. data/lib/e11y/event/base.rb +75 -10
  37. data/lib/e11y/event/value_sampling_config.rb +10 -4
  38. data/lib/e11y/events/rails/http/request.rb +1 -1
  39. data/lib/e11y/instruments/active_job.rb +6 -3
  40. data/lib/e11y/instruments/rails_instrumentation.rb +51 -28
  41. data/lib/e11y/instruments/sidekiq.rb +7 -7
  42. data/lib/e11y/logger/bridge.rb +24 -54
  43. data/lib/e11y/metrics/cardinality_protection.rb +257 -12
  44. data/lib/e11y/metrics/cardinality_tracker.rb +17 -0
  45. data/lib/e11y/metrics/registry.rb +6 -2
  46. data/lib/e11y/metrics/relabeling.rb +0 -56
  47. data/lib/e11y/metrics.rb +6 -1
  48. data/lib/e11y/middleware/audit_signing.rb +12 -9
  49. data/lib/e11y/middleware/pii_filter.rb +18 -10
  50. data/lib/e11y/middleware/request.rb +10 -4
  51. data/lib/e11y/middleware/routing.rb +117 -90
  52. data/lib/e11y/middleware/sampling.rb +47 -28
  53. data/lib/e11y/middleware/trace_context.rb +40 -11
  54. data/lib/e11y/middleware/validation.rb +20 -2
  55. data/lib/e11y/middleware/versioning.rb +1 -1
  56. data/lib/e11y/pii.rb +7 -7
  57. data/lib/e11y/railtie.rb +24 -20
  58. data/lib/e11y/reliability/circuit_breaker.rb +3 -0
  59. data/lib/e11y/reliability/dlq/file_storage.rb +16 -5
  60. data/lib/e11y/reliability/dlq/filter.rb +3 -0
  61. data/lib/e11y/reliability/retry_handler.rb +4 -0
  62. data/lib/e11y/sampling/error_spike_detector.rb +16 -5
  63. data/lib/e11y/sampling/load_monitor.rb +13 -4
  64. data/lib/e11y/self_monitoring/reliability_monitor.rb +3 -0
  65. data/lib/e11y/version.rb +1 -1
  66. data/lib/e11y.rb +86 -9
  67. metadata +83 -38
  68. data/docs/use_cases/UC-019-tiered-storage-migration.md +0 -562
  69. data/lib/e11y/middleware/pii_filtering.rb +0 -280
  70. data/lib/e11y/middleware/slo.rb +0 -168
@@ -2380,6 +2380,253 @@ end
2380
2380
 
2381
2381
  ---
2382
2382
 
2383
+ ## 14. Retention-Based Routing (Phase 5 Extension)
2384
+
2385
+ **Status:** ✅ Proposed (2026-01-21)
2386
+ **Covers:** UC-019 (Retention-Based Event Routing), ADR-009 (Cost Optimization)
2387
+ **Replaces:** TieredStorage adapter (routing handles all use cases)
2388
+
2389
+ ### 14.1. Problem Statement
2390
+
2391
+ Events have different retention requirements, but nothing routes them to matching storage automatically:
2392
+
2393
+ - **Audit logs:** 7 years (GDPR, SOX)
2394
+ - **Business events:** 90 days (analytics)
2395
+ - **Debug logs:** 7 days (troubleshooting)
2396
+ - **Metrics:** 30 days (monitoring)
2397
+
2398
+ **Current Issues:**
2399
+ 1. Manual adapter selection per event
2400
+ 2. No enforcement of retention policies
2401
+ 3. Cost inefficiency (debug logs stored for years)
2402
+ 4. Compliance risk (audit logs deleted early)
2403
+
2404
+ ### 14.2. Solution: Declarative Retention + Lambda Routing
2405
+
2406
+ ```ruby
2407
+ # Event declares retention intent
2408
+ class UserDeletedEvent < E11y::Event::Base
2409
+ audit_event true
2410
+ retention_period 7.years # ← Declarative
2411
+ end
2412
+
2413
+ # Config defines routing logic
2414
+ E11y.configure do |config|
2415
+ config.routing_rules = [
2416
+ ->(event) { :audit_encrypted if event[:audit_event] },
2417
+ ->(event) {
2418
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2419
+ days > 90 ? :s3_glacier : :loki
2420
+ }
2421
+ ]
2422
+ end
2423
+ ```
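The retention rule above derives the remaining days from the ISO 8601 `retention_until` string. A minimal sketch of that calculation in plain Ruby follows. The helper names `days_remaining` and `retention_rule`, and the fixed `now` value, are hypothetical and exist only for illustration; outside Rails, `Time.parse` and `Time#iso8601` need the standard library `time`.

```ruby
require "time" # Time.parse and Time#iso8601 come from the stdlib "time" extension

SECONDS_PER_DAY = 86_400

# Hypothetical helper: fractional days between `now` and the event's deadline.
def days_remaining(event, now: Time.now)
  (Time.parse(event[:retention_until]) - now) / SECONDS_PER_DAY
end

# Same shape as the retention rule above, with the clock injectable for tests.
def retention_rule(now)
  ->(event) { days_remaining(event, now: now) > 90 ? :s3_glacier : :loki }
end

now = Time.utc(2026, 1, 21)
short_lived = { retention_until: (now + 7 * SECONDS_PER_DAY).iso8601 }
long_lived  = { retention_until: (now + 365 * SECONDS_PER_DAY).iso8601 }

rule = retention_rule(now)
puts rule.call(short_lived) # loki
puts rule.call(long_lived)  # s3_glacier
```

Injecting the clock keeps the rule deterministic in specs; the rules in this section call `Time.now` directly, which is simpler but ties the result to wall-clock time.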
2424
+
2425
+ ### 14.3. Event::Base Extension
2426
+
2427
+ ```ruby
2428
+ module E11y
2429
+ module Event
2430
+ class Base
2431
+ # Class-level DSL: declare (or read) the retention period
2432
+ def self.retention_period(value = nil)
2433
+ @retention_period = value if value
2434
+ @retention_period ||
2435
+ (superclass.retention_period if superclass.respond_to?(:retention_period)) ||
2436
+ E11y.configuration.default_retention_period ||
2437
+ 30.days
2438
+ end
2439
+
2440
+ def self.track(**payload)
2441
+ event_hash = {
2442
+ # ... existing fields ...
2443
+ retention_until: (Time.now + retention_period).iso8601, # ← Auto-calculated
2444
+ audit_event: audit_event? # ← For routing
2445
+ }
2446
+
2447
+ E11y::Pipeline.process(event_hash)
2448
+ end
2449
+ end
2450
+ end
2451
+ end
2452
+ ```
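Because the DSL is invoked in the class body (`retention_period 7.years`), the lookup has to resolve at class level and walk up the inheritance chain until it finds a value. The following is a minimal sketch of that lookup in plain Ruby, not the gem's implementation: `SketchEvent`, `DAY`, and `DEFAULT_RETENTION` are hypothetical names, and durations are plain seconds so the example runs without ActiveSupport.

```ruby
# Hypothetical stand-in for E11y::Event::Base; durations are plain seconds.
DAY = 86_400
DEFAULT_RETENTION = 30 * DAY # assumed global default

class SketchEvent
  class << self
    # With an argument: set the period for this class.
    # Without: return own value, else the nearest ancestor's, else the default.
    def retention_period(value = nil)
      @retention_period = value if value
      return @retention_period if @retention_period

      if superclass.respond_to?(:retention_period)
        superclass.retention_period
      else
        DEFAULT_RETENTION
      end
    end
  end
end

class AuditSketch < SketchEvent
  retention_period 7 * 365 * DAY
end

class LoginAudit < AuditSketch; end  # inherits 7 years from AuditSketch
class OrderSketch < SketchEvent; end # falls back to the default

puts LoginAudit.retention_period / DAY  # 2555
puts OrderSketch.retention_period / DAY # 30
```

The `respond_to?` guard matters at the top of the hierarchy, where the superclass is `Object` and has no `retention_period` method.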
2453
+
2454
+ ### 14.4. Routing Middleware
2455
+
2456
+ ```ruby
2457
+ module E11y
2458
+ module Middleware
2459
+ class Routing < Base
2460
+ def initialize(rules: [], fallback_adapters: [])
2461
+ @rules = rules
2462
+ @fallback_adapters = fallback_adapters
2463
+ end
2464
+
2465
+ def call(event_hash)
2466
+ # 1. Explicit adapters bypass routing
2467
+ target_adapters = if event_hash[:adapters]&.any?
2468
+ event_hash[:adapters]
2469
+ else
2470
+ # 2. Apply routing rules (lambdas)
2471
+ apply_routing_rules(event_hash)
2472
+ end
2473
+
2474
+ # 3. Write to selected adapters
2475
+ target_adapters.each do |adapter_name|
2476
+ adapter = E11y.configuration.adapters[adapter_name]
2477
+ adapter&.write(event_hash)
2478
+ end
2479
+
2480
+ next_middleware&.call(event_hash)
2481
+ end
2482
+
2483
+ private
2484
+
2485
+ def apply_routing_rules(event_hash)
2486
+ matched = []
2487
+ @rules.each do |rule|
2488
+ result = rule.call(event_hash)
2489
+ matched.concat(Array(result)) if result
2490
+ end
2491
+ matched.any? ? matched.uniq : @fallback_adapters
2492
+ end
2493
+ end
2494
+ end
2495
+ end
2496
+ ```
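The adapter-selection logic of the middleware above can be exercised in isolation. This is a sketch under stated assumptions: `resolve_adapters` and `SKETCH_RULES` are hypothetical names, and the function only mirrors the selection step (explicit adapters, then rules, then fallback) without writing to any adapter.

```ruby
# Hypothetical standalone version of the adapter-selection step.
def resolve_adapters(event, rules:, fallback:)
  return event[:adapters] if event[:adapters]&.any? # explicit adapters win

  # Evaluate every rule, drop nil/false results, flatten symbols and arrays.
  matched = rules.filter_map { |rule| rule.call(event) }
                 .flat_map { |result| Array(result) }
  matched.empty? ? fallback : matched.uniq
end

SKETCH_RULES = [
  ->(event) { :audit_encrypted if event[:audit_event] },
  ->(event) { :sentry if event[:severity] == :error }
].freeze

p resolve_adapters({ audit_event: true, severity: :error },
                   rules: SKETCH_RULES, fallback: [:stdout])
# [:audit_encrypted, :sentry]
p resolve_adapters({ severity: :info },
                   rules: SKETCH_RULES, fallback: [:stdout])
# [:stdout]
p resolve_adapters({ adapters: [:loki], audit_event: true },
                   rules: SKETCH_RULES, fallback: [:stdout])
# [:loki]
```

The last call shows a consequence of the bypass: an event with explicit adapters skips every rule, including the audit rule, so audit events that set `adapters` need `:audit_encrypted` listed explicitly.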
2497
+
2498
+ ### 14.5. Configuration
2499
+
2500
+ ```ruby
2501
+ E11y.configure do |config|
2502
+ config.default_retention_period = 30.days
2503
+
2504
+ config.routing_rules = [
2505
+ # Rule 1: Audit → encrypted storage
2506
+ ->(event) { :audit_encrypted if event[:audit_event] },
2507
+
2508
+ # Rule 2: Long retention → cold storage
2509
+ ->(event) {
2510
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2511
+ :s3_glacier if days > 90
2512
+ },
2513
+
2514
+ # Rule 3: Medium retention → warm storage
2515
+ ->(event) {
2516
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2517
+ :s3_standard if days > 30 && days <= 90
2518
+ },
2519
+
2520
+ # Rule 4: Short retention → hot storage
2521
+ ->(event) {
2522
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2523
+ :loki if days <= 30
2524
+ },
2525
+
2526
+ # Rule 5: Errors → also to Sentry
2527
+ ->(event) { :sentry if event[:severity] == :error }
2528
+ ]
2529
+
2530
+ config.fallback_adapters = [:stdout]
2531
+ end
2532
+ ```
2533
+
2534
+ ### 14.6. Usage Examples
2535
+
2536
+ ```ruby
2537
+ # Default retention (30 days from config)
2538
+ class OrderEvent < E11y::Event::Base
2539
+ schema { required(:order_id).filled(:integer) }
2540
+ end
2541
+
2542
+ # Explicit retention (7 years audit)
2543
+ class UserDeletedEvent < E11y::Event::Base
2544
+ audit_event true
2545
+ retention_period 7.years
2546
+ end
2547
+
2548
+ # Short retention (7 days debug)
2549
+ class DebugEvent < E11y::Event::Base
2550
+ retention_period 7.days
2551
+ end
2552
+
2553
+ # Explicit adapters (bypass routing)
2554
+ class PaymentEvent < E11y::Event::Base
2555
+ retention_period 1.year
2556
+ adapters :audit_encrypted, :loki # Explicit
2557
+ end
2558
+ ```
2559
+
2560
+ ### 14.7. Cost Optimization
2561
+
2562
+ **Before:** All events → Loki (30 days default)
2563
+ - Debug logs: $500/month
2564
+ - Total: $500/month
2565
+
2566
+ **After:** Automatic routing
2567
+ - Debug logs (7 days) → stdout: $0/month
2568
+ - Business (30 days) → Loki: $100/month
2569
+ - Audit (7 years) → S3 Glacier: $5/month
2570
+ - **Total: $105/month (about 80% savings)**
2571
+
2572
+ ### 14.8. Advantages vs TieredStorage
2573
+
2574
+ | Aspect | TieredStorage Adapter | Retention Routing |
2575
+ |--------|---------------------|------------------|
2576
+ | **Flexibility** | Fixed 3 tiers (hot/warm/cold) | Unlimited adapters via lambdas |
2577
+ | **Configuration** | Adapter-specific | Centralized rules |
2578
+ | **Migration** | Background jobs | No migration needed |
2579
+ | **Cost** | All events go through same adapter | Events routed to optimal storage |
2580
+ | **Compliance** | Manual tier selection | Automatic audit routing |
2581
+
2582
+ **Decision:** Remove TieredStorage, use Retention Routing instead.
2583
+
2584
+ ### 14.9. Migration Path
2585
+
2586
+ 1. ✅ Add `retention_period` DSL to Event::Base
2587
+ 2. ✅ Add `routing_rules` to Configuration
2588
+ 3. ✅ Implement Routing middleware
2589
+ 4. ✅ Update existing events:
2590
+ ```ruby
2591
+ # Before
2592
+ class OrderEvent < E11y::Event::Base
2593
+ adapters :loki
2594
+ end
2595
+
2596
+ # After
2597
+ class OrderEvent < E11y::Event::Base
2598
+ retention_period 30.days # Routing handles adapter selection
2599
+ end
2600
+ ```
2601
+ 5. ✅ Remove TieredStorage adapter + tests
2602
+
2603
+ ### 14.10. Testing
2604
+
2605
+ ```ruby
2606
+ RSpec.describe E11y::Middleware::Routing do
2607
+ it 'routes audit events to audit_encrypted' do
2608
+ event = { audit_event: true, retention_until: 7.years.from_now.iso8601 }
2609
+ expect(adapters[:audit_encrypted]).to receive(:write)
2610
+ routing.call(event)
2611
+ end
2612
+
2613
+ it 'routes short retention to loki' do
2614
+ event = { retention_until: 20.days.from_now.iso8601 }
2615
+ expect(adapters[:loki]).to receive(:write)
2616
+ routing.call(event)
2617
+ end
2618
+
2619
+ it 'respects explicit adapters' do
2620
+ event = { adapters: [:sentry], retention_until: 1.year.from_now.iso8601 }
2621
+ expect(adapters[:sentry]).to receive(:write)
2622
+ expect(adapters[:s3_glacier]).not_to receive(:write)
2623
+ routing.call(event)
2624
+ end
2625
+ end
2626
+ ```
2627
+
2628
+ ---
2629
+
2383
2630
  **Status:** ✅ Draft Complete (Updated to Unified DSL v1.1.0)
2384
2631
  **Next:** ADR-006 (Security & Compliance) or ADR-008 (Rails Integration)
2385
2632
  **Estimated Implementation:** 2 weeks
@@ -15,7 +15,7 @@
15
15
  - ✅ **Value-Based Sampling** (FEAT-4846) - DSL for sampling by payload values (>, <, ==, range)
16
16
  - ✅ **Stratified Sampling** (FEAT-4850, C11 resolution) - SLO-accurate sampling with correction
17
17
  - ⏳ **Compression** - Not started
18
- - **Tiered Storage** - Not started
18
+ - **Retention-Based Routing** (2026-01-21) - Replaces TieredStorage adapter
19
19
 
20
20
  ---
21
21
 
@@ -2023,163 +2023,279 @@ end
2023
2023
 
2024
2024
  ---
2025
2025
 
2026
- ## 6. Tiered Storage
2026
+ ## 6. Retention-Based Routing
2027
2027
 
2028
- ### 6.1. Retention Tagging
2028
+ **Status:** Proposed (Phase 5 Extension, 2026-01-21)
2029
+ **Replaces:** TieredStorage adapter (routing handles all use cases)
2030
+ **Related:** ADR-004 §14 (Retention-Based Routing), UC-019
2029
2031
 
2030
- **Design Decision:** E11y adds `retention_until` timestamp, downstream systems handle deletion.
2032
+ ### 6.1. Problem: Cost Inefficiency with Manual Adapter Selection
2031
2033
 
2034
+ **Current State:**
2032
2035
  ```ruby
2033
- # lib/e11y/cost/retention_tagger.rb
2036
+ # ❌ Developer must remember correct adapter for each event type
2037
+ class DebugEvent < E11y::Event::Base
2038
+ adapters :loki # Expensive! Stores debug for 30 days
2039
+ end
2040
+
2041
+ class AuditEvent < E11y::Event::Base
2042
+ adapters :audit_encrypted # Correct but easy to forget
2043
+ end
2044
+ ```
2045
+
2046
+ **Issues:**
2047
+ 1. ❌ No automatic cost optimization
2048
+ 2. ❌ Debug logs stored in expensive storage
2049
+ 3. ❌ Compliance risk (audit logs in wrong storage)
2050
+ 4. ❌ Manual adapter selection per event
2051
+
2052
+ **Cost Impact:**
2053
+ - Debug logs (short-term) in Loki: **$500/month**
2054
+ - No tiered storage → **wasted capacity**
2055
+
2056
+ ### 6.2. Solution: Declarative Retention + Lambda Routing
2057
+
2058
+ **Design Decision:** Events declare their retention intent; the routing middleware selects the adapters.
2059
+
2060
+ ```ruby
2061
+ # Event declares retention (declarative)
2062
+ class DebugEvent < E11y::Event::Base
2063
+ retention_period 7.days # ← Intent
2064
+ end
2065
+
2066
+ class AuditEvent < E11y::Event::Base
2067
+ audit_event true
2068
+ retention_period 7.years # ← Intent
2069
+ end
2070
+
2071
+ # Config defines routing logic (centralized)
2072
+ E11y.configure do |config|
2073
+ config.default_retention_period = 30.days
2074
+
2075
+ config.routing_rules = [
2076
+ # Rule 1: Audit → encrypted storage
2077
+ ->(event) { :audit_encrypted if event[:audit_event] },
2078
+
2079
+ # Rule 2: Long retention → cold storage
2080
+ ->(event) {
2081
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2082
+ :s3_glacier if days > 90
2083
+ },
2084
+
2085
+ # Rule 3: Short retention → hot storage
2086
+ ->(event) {
2087
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2088
+ :loki if days <= 30
2089
+ }
2090
+ ]
2091
+ end
2092
+ ```
2093
+
2094
+ **Result:**
2095
+ - ✅ **Automatic cost optimization** (routing handles adapter selection)
2096
+ - ✅ **Compliance enforcement** (audit events guaranteed in encrypted storage)
2097
+ - ✅ **Flexible rules** (lambdas allow complex logic)
2098
+ - ✅ **80% cost savings** (debug logs → stdout, audit → cold storage)
2099
+
2100
+ ### 6.3. Event::Base Extension
2101
+
2102
+ ```ruby
2103
+ # lib/e11y/event/base.rb
2104
+
2034
2105
  module E11y
2035
- module Cost
2036
- class RetentionTagger
2037
- def initialize(config)
2038
- @retention_rules = config.retention_rules
2039
- end
2040
-
2041
- def tag_event(event_data)
2042
- # Find matching retention rule
2043
- retention_days = determine_retention(event_data)
2044
-
2045
- # Calculate absolute retention_until date
2046
- retention_until = Time.now + retention_days.days
2047
-
2048
- # Add to event metadata
2049
- event_data[:retention_until] = retention_until.iso8601
2050
- event_data[:retention_days] = retention_days
2051
-
2052
- E11y::Metrics.histogram('e11y.retention.days', retention_days, {
2053
- event_name: event_data[:event_name]
2054
- })
2106
+ module Event
2107
+ class Base
2108
+ # Class-level DSL: declare (or read) the retention period
2109
+ def self.retention_period(value = nil)
2110
+ @retention_period = value if value
2111
+ @retention_period ||
2112
+ (superclass.retention_period if superclass.respond_to?(:retention_period)) ||
2113
+ E11y.configuration.default_retention_period ||
2114
+ 30.days
2115
+ end
2116
+
2117
+ def self.track(**payload)
2118
+ event_hash = {
2119
+ event_name: event_name,
2120
+ payload: payload,
2121
+ severity: severity,
2122
+ version: version,
2123
+ adapters: adapters, # Can be nil (use routing)
2124
+ timestamp: Time.now.utc.iso8601(3),
2125
+ retention_until: (Time.now + retention_period).iso8601, # ← Auto-calculated
2126
+ audit_event: audit_event? # ← For routing rules
2127
+ }
2055
2128
 
2056
- event_data
2129
+ E11y::Pipeline.process(event_hash)
2130
+ end
2131
+ end
2132
+ end
2133
+ end
2134
+ ```
2135
+
2136
+ ### 6.4. Routing Middleware
2137
+
2138
+ ```ruby
2139
+ # lib/e11y/middleware/routing.rb
2140
+
2141
+ module E11y
2142
+ module Middleware
2143
+ class Routing < Base
2144
+ def initialize(rules: [], fallback_adapters: [])
2145
+ @rules = rules
2146
+ @fallback_adapters = fallback_adapters
2057
2147
  end
2058
2148
 
2059
- private
2060
-
2061
- def determine_retention(event_data)
2062
- # Priority 1: Explicit retention in payload
2063
- return event_data[:payload][:retention_days] if event_data[:payload][:retention_days]
2064
-
2065
- # Priority 2: Pattern-based rules
2066
- rule = @retention_rules.find do |r|
2067
- r.matches?(event_data)
2149
+ def call(event_hash)
2150
+ # 1. Explicit adapters bypass routing
2151
+ target_adapters = if event_hash[:adapters]&.any?
2152
+ event_hash[:adapters]
2153
+ else
2154
+ # 2. Apply routing rules (lambdas)
2155
+ apply_routing_rules(event_hash)
2068
2156
  end
2069
2157
 
2070
- return rule.retention_days if rule
2158
+ # 3. Write to selected adapters
2159
+ target_adapters.each do |adapter_name|
2160
+ adapter = E11y.configuration.adapters[adapter_name]
2161
+ adapter&.write(event_hash) rescue nil # NOTE: silently swallows adapter write errors
2162
+ end
2071
2163
 
2072
- # Default retention
2073
- 30 # 30 days
2164
+ next_middleware&.call(event_hash)
2074
2165
  end
2075
- end
2076
-
2077
- class RetentionRule
2078
- attr_reader :retention_days
2079
2166
 
2080
- def initialize(retention_days:, &condition)
2081
- @retention_days = retention_days
2082
- @condition = condition
2083
- end
2167
+ private
2084
2168
 
2085
- def matches?(event_data)
2086
- @condition.call(event_data)
2169
+ def apply_routing_rules(event_hash)
2170
+ matched = []
2171
+ @rules.each do |rule|
2172
+ result = rule.call(event_hash)
2173
+ matched.concat(Array(result)) if result
2174
+ end
2175
+ matched.any? ? matched.uniq : @fallback_adapters
2087
2176
  end
2088
2177
  end
2089
2178
  end
2090
2179
  end
2091
2180
  ```
2092
2181
 
2093
- ### 6.2. Configuration
2182
+ ### 6.5. Configuration Examples
2183
+
2184
+ **Example 1: Simple Tiered Routing**
2094
2185
 
2095
2186
  ```ruby
2096
2187
  E11y.configure do |config|
2097
- config.cost_optimization.tiered_storage do
2098
- # Rule 1: Audit events → 7 years (compliance)
2099
- retention_rule 2555 do |event| # 7 * 365 days
2100
- event[:event_name].start_with?('audit.')
2101
- end
2102
-
2103
- # Rule 2: Payment events → 2 years (legal)
2104
- retention_rule 730 do |event| # 2 * 365 days
2105
- event[:event_name].start_with?('payment.')
2106
- end
2188
+ config.default_retention_period = 30.days
2189
+
2190
+ config.routing_rules = [
2191
+ ->(event) { :audit_encrypted if event[:audit_event] },
2192
+ ->(event) {
2193
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2194
+ case days
2195
+ when 0..30 then :loki # Hot storage
2196
+ when 30..90 then :s3_standard # Warm storage (days is a Float; 31..90 would miss 30 < days < 31)
2197
+ else :s3_glacier # Cold storage
2198
+ end
2199
+ }
2200
+ ]
2201
+
2202
+ config.fallback_adapters = [:stdout]
2203
+ end
2204
+ ```
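Since `days` is a Float, tier boundaries need to be contiguous: integer ranges such as `0..30` and `31..90` leave a gap for fractional values between 30 and 31, which would then fall through to the `else` branch. A sketch of a gap-free mapping using plain comparisons follows; `tier_for` is a hypothetical helper, not part of E11y.

```ruby
# Hypothetical helper: maps fractional days of remaining retention to a tier.
def tier_for(days)
  if days <= 30
    :loki        # hot storage
  elsif days <= 90
    :s3_standard # warm storage
  else
    :s3_glacier  # cold storage
  end
end

# Integer ranges do not cover the fractional values between them:
p((0..30).cover?(30.5))  # false
p((31..90).cover?(30.5)) # false

p tier_for(30.5) # :s3_standard
p tier_for(7)    # :loki
p tier_for(365)  # :s3_glacier
```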
2205
+
2206
+ **Example 2: Complex Business Rules**
2207
+
2208
+ ```ruby
2209
+ E11y.configure do |config|
2210
+ config.routing_rules = [
2211
+ # Priority 1: Audit events
2212
+ ->(event) { :audit_encrypted if event[:audit_event] },
2107
2213
 
2108
- # Rule 3: Debug events → 7 days (troubleshooting)
2109
- retention_rule 7 do |event|
2110
- event[:severity] == :debug
2111
- end
2214
+ # Priority 2: Errors
2215
+ ->(event) { [:sentry, :loki] if event[:severity] == :error },
2112
2216
 
2113
- # Rule 4: Errors → 90 days (analysis)
2114
- retention_rule 90 do |event|
2115
- event[:severity] >= :error
2116
- end
2217
+ # Priority 3: High-value payments
2218
+ ->(event) {
2219
+ if event[:event_name] == "payment.completed" &&
2220
+ event[:payload][:amount] > 10000
2221
+ [:audit_encrypted, :loki] # Dual storage
2222
+ end
2223
+ },
2117
2224
 
2118
- # Default: 30 days
2119
- default_retention 30
2120
- end
2225
+ # Priority 4: Retention-based routing
2226
+ ->(event) {
2227
+ days = (Time.parse(event[:retention_until]) - Time.now) / 86400
2228
+ days > 90 ? :s3_glacier : :loki
2229
+ }
2230
+ ]
2121
2231
  end
2122
2232
  ```
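The "Priority" labels in Example 2 describe evaluation order, not short-circuiting: with the accumulate-and-dedupe logic of `apply_routing_rules` shown in 6.4, every rule runs and the results are merged. The sketch below replays the four rules against two sample events. The fixed clock `NOW`, the constant `EXAMPLE_RULES`, the helper `matched_adapters`, and the sample payloads are all hypothetical.

```ruby
require "time"

NOW = Time.utc(2026, 1, 21) # fixed clock so the output is reproducible

EXAMPLE_RULES = [
  ->(e) { :audit_encrypted if e[:audit_event] },
  ->(e) { [:sentry, :loki] if e[:severity] == :error },
  ->(e) {
    if e[:event_name] == "payment.completed" && e[:payload][:amount] > 10_000
      [:audit_encrypted, :loki]
    end
  },
  ->(e) { (Time.parse(e[:retention_until]) - NOW) / 86_400 > 90 ? :s3_glacier : :loki }
].freeze

# Same accumulate-then-dedupe behavior as apply_routing_rules.
def matched_adapters(event)
  EXAMPLE_RULES.filter_map { |rule| rule.call(event) }
               .flat_map { |result| Array(result) }
               .uniq
end

payment = {
  event_name: "payment.completed", severity: :info,
  payload: { amount: 25_000 },
  retention_until: (NOW + 30 * 86_400).iso8601
}
audit = {
  event_name: "user.deleted", audit_event: true, severity: :info,
  payload: {},
  retention_until: (NOW + 7 * 365 * 86_400).iso8601
}

p matched_adapters(payment) # [:audit_encrypted, :loki]
p matched_adapters(audit)   # [:audit_encrypted, :s3_glacier]
```

Under these rules an audit event with long retention is written to both `:audit_encrypted` and `:s3_glacier`. If audit data must live only in encrypted storage, the retention rule would need to return `nil` for audit events.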
2123
2233
 
2124
- ### 6.3. Downstream Integration
2234
+ ### 6.6. Cost Comparison
2125
2235
 
2126
- **Elasticsearch ILM:**
2236
+ | Scenario | Before (Manual) | After (Routing) | Savings |
2237
+ |----------|----------------|----------------|---------|
2238
+ | **Debug logs (7d retention)** | Loki (30d): $500/mo | stdout: $0/mo | **100%** |
2239
+ | **Business events (30d)** | Loki: $100/mo | Loki: $100/mo | 0% |
2240
+ | **Audit (7y retention)** | Loki: $5000/mo | S3 Glacier: $50/mo | **99%** |
2241
+ | **Total** | $5,600/mo | $150/mo | **97.3%** |
2127
2242
 
2128
- ```json
2129
- {
2130
- "policy": {
2131
- "phases": {
2132
- "hot": {
2133
- "actions": {}
2134
- },
2135
- "delete": {
2136
- "min_age": "0d",
2137
- "actions": {
2138
- "delete": {
2139
- "delete_searchable_snapshot": false
2140
- }
2141
- }
2142
- }
2143
- }
2144
- }
2145
- }
2146
- ```
2243
+ **Key Savings:**
2244
+ - ✅ Debug logs → stdout (free)
2245
+ - ✅ Audit logs → S3 Glacier ($0.004/GB vs $0.15/GB in Loki)
2246
+ - ✅ Automatic tiering → no manual intervention
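The totals and the overall savings percentage in the cost table can be checked with a few lines of arithmetic. The figures are copied from the table; the variable names are illustrative only.

```ruby
# Monthly figures copied from the cost comparison table (USD).
costs = {
  debug:    { before: 500,  after: 0 },
  business: { before: 100,  after: 100 },
  audit:    { before: 5000, after: 50 }
}

before = costs.values.sum { |c| c[:before] }
after  = costs.values.sum { |c| c[:after] }
savings_pct = ((before - after) * 100.0 / before).round(1)

puts "before=$#{before} after=$#{after} savings=#{savings_pct}%"
# before=$5600 after=$150 savings=97.3%
```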
2147
2247
 
2148
- **Query for deletion:**
2248
+ ### 6.7. Advantages over TieredStorage Adapter
2149
2249
 
2150
- ```ruby
2151
- # Elasticsearch query
2152
- DELETE /e11y-events-*/_query
2250
+ | Aspect | TieredStorage Adapter | Retention Routing |
2251
+ |--------|---------------------|------------------|
2252
+ | **Flexibility** | Fixed 3 tiers | Unlimited adapters via lambdas |
2253
+ | **Configuration** | Adapter-specific | Centralized rules |
2254
+ | **Migration** | Background jobs needed | No migration (routing at write time) |
2255
+ | **Cost** | All events same flow | Optimal per-event routing |
2256
+ | **Compliance** | Manual | Automatic (audit → encrypted) |
2257
+ | **Maintenance** | Complex adapter code | Simple lambda rules |
2258
+
2259
+ **Decision:** ✅ Use Retention Routing, remove TieredStorage adapter.
2260
+
2261
+ ### 6.8. Downstream Storage Integration
2262
+
2263
+ **Loki (Hot Storage - 30 days):**
2264
+ ```yaml
2265
+ # loki-config.yaml
2266
+ limits_config:
2267
+ retention_period: 720h # 30 days
2268
+ ```
2269
+
2270
+ **S3 Standard (Warm Storage - 90 days):**
2271
+ ```json
2153
2272
  {
2154
- "query": {
2155
- "range": {
2156
- "retention_until": {
2157
- "lt": "now"
2158
- }
2159
- }
2160
- }
2273
+ "Rules": [{
2274
+ "Id": "transition-to-glacier",
2275
+ "Status": "Enabled",
2276
+ "Transitions": [{
2277
+ "Days": 90,
2278
+ "StorageClass": "GLACIER"
2279
+ }]
2280
+ }]
2161
2281
  }
2162
2282
  ```
2163
2283
 
2164
- **S3 Lifecycle:**
2165
-
2284
+ **S3 Glacier (Cold Storage - 7 years):**
2166
2285
  ```json
2167
2286
  {
2168
- "Rules": [
2169
- {
2170
- "Id": "delete-expired-events",
2171
- "Status": "Enabled",
2172
- "Filter": {
2173
- "Prefix": "events/"
2174
- },
2175
- "Expiration": {
2176
- "Days": 365
2177
- }
2287
+ "Rules": [{
2288
+ "Id": "delete-after-7-years",
2289
+ "Status": "Enabled",
2290
+ "Expiration": {
2291
+ "Days": 2555
2178
2292
  }
2179
- ]
2293
+ }]
2180
2294
  }
2181
2295
  ```
2182
2296
 
2297
+ **Key Point:** E11y routes each event to the correct storage tier; the downstream systems (Loki, S3) enforce the actual retention policies.
2298
+
2183
2299
  ---
2184
2300
 
2185
2301
  ## 7. Payload Minimization