e11y 0.2.0 → 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.rubocop.yml +130 -10
- data/CHANGELOG.md +56 -1
- data/CLAUDE.md +168 -0
- data/CONTRIBUTING.md +640 -0
- data/README.md +134 -702
- data/RELEASE.md +18 -3
- data/Rakefile +108 -29
- data/config/README.md +1 -1
- data/config/loki-local-config.yaml +12 -0
- data/config/otel-collector-config.yaml +44 -0
- data/cucumber.yml +1 -0
- data/docker-compose.yml +18 -2
- data/docs/ADAPTERS.md +76 -0
- data/docs/ADAPTIVE_SAMPLING.md +59 -0
- data/docs/COMPARISON.md +104 -0
- data/docs/CONFIGURATION.md +52 -0
- data/docs/DISTRIBUTED_TRACING.md +44 -0
- data/docs/LIMITATIONS.md +13 -0
- data/docs/METRICS_DSL.md +84 -0
- data/docs/PERFORMANCE.md +60 -0
- data/docs/PII_FILTERING.md +40 -0
- data/docs/PRESETS.md +65 -0
- data/docs/QUICK-START.md +546 -587
- data/docs/RAILS_INTEGRATION.md +29 -0
- data/docs/SCHEMA_VALIDATION.md +63 -0
- data/docs/SLO-PROMQL-ALERTS.md +161 -0
- data/docs/TESTING.md +69 -0
- data/docs/{ADR-001-architecture.md → architecture/ADR-001-architecture.md} +35 -64
- data/docs/{ADR-002-metrics-yabeda.md → architecture/ADR-002-metrics-yabeda.md} +62 -236
- data/docs/{ADR-003-slo-observability.md → architecture/ADR-003-slo-observability.md} +27 -466
- data/docs/{ADR-004-adapter-architecture.md → architecture/ADR-004-adapter-architecture.md} +163 -146
- data/docs/{ADR-005-tracing-context.md → architecture/ADR-005-tracing-context.md} +10 -9
- data/docs/{ADR-006-security-compliance.md → architecture/ADR-006-security-compliance.md} +184 -191
- data/docs/{ADR-007-opentelemetry-integration.md → architecture/ADR-007-opentelemetry-integration.md} +3 -21
- data/docs/{ADR-008-rails-integration.md → architecture/ADR-008-rails-integration.md} +209 -339
- data/docs/{ADR-009-cost-optimization.md → architecture/ADR-009-cost-optimization.md} +45 -54
- data/docs/architecture/ADR-010-developer-experience.md +522 -0
- data/docs/{ADR-011-testing-strategy.md → architecture/ADR-011-testing-strategy.md} +41 -83
- data/docs/{ADR-013-reliability-error-handling.md → architecture/ADR-013-reliability-error-handling.md} +37 -12
- data/docs/{ADR-014-event-driven-slo.md → architecture/ADR-014-event-driven-slo.md} +12 -24
- data/docs/{ADR-015-middleware-order.md → architecture/ADR-015-middleware-order.md} +23 -41
- data/docs/{ADR-016-self-monitoring-slo.md → architecture/ADR-016-self-monitoring-slo.md} +52 -349
- data/docs/{ADR-017-multi-rails-compatibility.md → architecture/ADR-017-multi-rails-compatibility.md} +4 -11
- data/docs/architecture/ADR-018-memory-optimization.md +366 -0
- data/docs/{ADR-INDEX.md → architecture/ADR-INDEX.md} +11 -6
- data/docs/{00-ICP-AND-TIMELINE.md → prd/00-ICP-AND-TIMELINE.md} +6 -6
- data/docs/{01-SCALE-REQUIREMENTS.md → prd/01-SCALE-REQUIREMENTS.md} +6 -6
- data/docs/prd/01-overview-vision.md +19 -14
- data/docs/use_cases/README.md +22 -23
- data/docs/use_cases/UC-001-request-scoped-debug-buffering.md +50 -44
- data/docs/use_cases/UC-002-business-event-tracking.md +26 -95
- data/docs/use_cases/UC-003-event-metrics.md +66 -0
- data/docs/use_cases/UC-004-zero-config-slo-tracking.md +42 -101
- data/docs/use_cases/UC-005-sentry-integration.md +13 -15
- data/docs/use_cases/UC-006-trace-context-management.md +30 -28
- data/docs/use_cases/UC-007-pii-filtering.md +35 -87
- data/docs/use_cases/UC-008-opentelemetry-integration.md +51 -89
- data/docs/use_cases/UC-009-multi-service-tracing.md +4 -4
- data/docs/use_cases/UC-010-background-job-tracking.md +5 -5
- data/docs/use_cases/UC-011-rate-limiting.md +95 -168
- data/docs/use_cases/UC-012-audit-trail.md +21 -46
- data/docs/use_cases/UC-013-high-cardinality-protection.md +29 -167
- data/docs/use_cases/UC-014-adaptive-sampling.md +2 -2
- data/docs/use_cases/UC-015-cost-optimization.md +46 -99
- data/docs/use_cases/UC-016-rails-logger-migration.md +39 -213
- data/docs/use_cases/UC-017-local-development.md +203 -777
- data/docs/use_cases/UC-018-testing-events.md +3 -3
- data/docs/use_cases/UC-019-retention-based-routing.md +53 -106
- data/docs/use_cases/UC-020-event-versioning.md +8 -9
- data/docs/use_cases/UC-021-error-handling-retry-dlq.md +18 -22
- data/docs/use_cases/UC-022-event-registry.md +15 -21
- data/docs/use_cases/backlog.md +119 -87
- data/e11y.gemspec +2 -2
- data/gems/e11y-devtools/README.md +136 -0
- data/gems/e11y-devtools/config/routes.rb +8 -0
- data/gems/e11y-devtools/e11y-devtools.gemspec +25 -0
- data/gems/e11y-devtools/exe/e11y +34 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/server.rb +96 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tool_base.rb +25 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/clear.rb +31 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/errors.rb +35 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/event_detail.rb +33 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/events_by_trace.rb +33 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/interactions.rb +40 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/recent_events.rb +34 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/search.rb +34 -0
- data/gems/e11y-devtools/lib/e11y/devtools/mcp/tools/stats.rb +30 -0
- data/gems/e11y-devtools/lib/e11y/devtools/overlay/assets/overlay.js +115 -0
- data/gems/e11y-devtools/lib/e11y/devtools/overlay/controller.rb +54 -0
- data/gems/e11y-devtools/lib/e11y/devtools/overlay/engine.rb +26 -0
- data/gems/e11y-devtools/lib/e11y/devtools/overlay/middleware.rb +80 -0
- data/gems/e11y-devtools/lib/e11y/devtools/overlay/rails_controller.rb +42 -0
- data/gems/e11y-devtools/lib/e11y/devtools/tui/app.rb +262 -0
- data/gems/e11y-devtools/lib/e11y/devtools/tui/grouping.rb +66 -0
- data/gems/e11y-devtools/lib/e11y/devtools/tui/widgets/event_detail.rb +62 -0
- data/gems/e11y-devtools/lib/e11y/devtools/tui/widgets/event_list.rb +70 -0
- data/gems/e11y-devtools/lib/e11y/devtools/tui/widgets/interaction_list.rb +47 -0
- data/gems/e11y-devtools/lib/e11y/devtools/version.rb +8 -0
- data/gems/e11y-devtools/lib/e11y/devtools.rb +13 -0
- data/gems/e11y-devtools/spec/e11y/devtools/mcp/tools_spec.rb +107 -0
- data/gems/e11y-devtools/spec/e11y/devtools/overlay/controller_spec.rb +58 -0
- data/gems/e11y-devtools/spec/e11y/devtools/overlay/middleware_spec.rb +46 -0
- data/gems/e11y-devtools/spec/e11y/devtools/tui/app_spec.rb +85 -0
- data/gems/e11y-devtools/spec/e11y/devtools/tui/grouping_spec.rb +64 -0
- data/gems/e11y-devtools/spec/spec_helper.rb +5 -0
- data/gems/e11y-devtools/spec/tui/widgets/event_list_spec.rb +44 -0
- data/gems/e11y-devtools/spec/tui/widgets/interaction_list_spec.rb +62 -0
- data/lib/e11y/adapters/audit_encrypted.rb +53 -11
- data/lib/e11y/adapters/base.rb +33 -34
- data/lib/e11y/adapters/dev_log/file_store.rb +143 -0
- data/lib/e11y/adapters/dev_log/query.rb +219 -0
- data/lib/e11y/adapters/dev_log.rb +118 -0
- data/lib/e11y/adapters/file.rb +3 -6
- data/lib/e11y/adapters/in_memory.rb +52 -5
- data/lib/e11y/adapters/in_memory_test.rb +29 -0
- data/lib/e11y/adapters/loki.rb +58 -23
- data/lib/e11y/adapters/null.rb +82 -0
- data/lib/e11y/adapters/opentelemetry_collector.rb +183 -0
- data/lib/e11y/adapters/otel_logs.rb +136 -23
- data/lib/e11y/adapters/sentry.rb +4 -7
- data/lib/e11y/adapters/stdout.rb +73 -7
- data/lib/e11y/adapters/yabeda.rb +153 -29
- data/lib/e11y/buffers/adaptive_buffer.rb +3 -17
- data/lib/e11y/buffers/{request_scoped_buffer.rb → ephemeral_buffer.rb} +72 -58
- data/lib/e11y/buffers/ring_buffer.rb +3 -16
- data/lib/e11y/configuration.rb +272 -0
- data/lib/e11y/console.rb +10 -17
- data/lib/e11y/current.rb +53 -1
- data/lib/e11y/debug/pipeline_inspector.rb +96 -0
- data/lib/e11y/documentation/generator.rb +48 -0
- data/lib/e11y/event/base.rb +176 -82
- data/lib/e11y/event/value_sampling_config.rb +1 -5
- data/lib/e11y/events/rails/database/query.rb +1 -4
- data/lib/e11y/events/rails/job/failed.rb +2 -0
- data/lib/e11y/instruments/active_job.rb +46 -12
- data/lib/e11y/instruments/rails_instrumentation.rb +49 -24
- data/lib/e11y/instruments/sidekiq.rb +137 -31
- data/lib/e11y/linters/base.rb +11 -0
- data/lib/e11y/linters/pii/pii_declaration_linter.rb +120 -0
- data/lib/e11y/linters/slo/config_consistency_linter.rb +76 -0
- data/lib/e11y/linters/slo/explicit_declaration_linter.rb +36 -0
- data/lib/e11y/linters/slo/slo_status_from_linter.rb +41 -0
- data/lib/e11y/logger/bridge.rb +26 -7
- data/lib/e11y/metrics/cardinality_protection.rb +10 -15
- data/lib/e11y/metrics/cardinality_tracker.rb +16 -6
- data/lib/e11y/metrics/registry.rb +3 -5
- data/lib/e11y/metrics/test_backend.rb +62 -0
- data/lib/e11y/metrics.rb +56 -10
- data/lib/e11y/middleware/adapter_resolver.rb +40 -0
- data/lib/e11y/middleware/audit_signing.rb +43 -6
- data/lib/e11y/middleware/baggage_protection.rb +75 -0
- data/lib/e11y/middleware/dev_log_source.rb +24 -0
- data/lib/e11y/middleware/event_slo.rb +23 -9
- data/lib/e11y/middleware/otel_span.rb +23 -0
- data/lib/e11y/middleware/pii_filter.rb +104 -75
- data/lib/e11y/middleware/rate_limiting.rb +54 -27
- data/lib/e11y/middleware/request.rb +70 -23
- data/lib/e11y/middleware/routing.rb +78 -21
- data/lib/e11y/middleware/sampling.rb +66 -17
- data/lib/e11y/middleware/self_monitoring_emit.rb +39 -0
- data/lib/e11y/middleware/trace_context.rb +45 -10
- data/lib/e11y/middleware/track_latency.rb +34 -0
- data/lib/e11y/middleware/validation.rb +7 -16
- data/lib/e11y/middleware/versioning.rb +26 -22
- data/lib/e11y/opentelemetry/semantic_conventions.rb +109 -0
- data/lib/e11y/opentelemetry/span_creator.rb +142 -0
- data/lib/e11y/pii/patterns.rb +12 -1
- data/lib/e11y/pipeline/builder.rb +1 -1
- data/lib/e11y/presets/audit_event.rb +13 -2
- data/lib/e11y/railtie.rb +52 -15
- data/lib/e11y/registry.rb +306 -0
- data/lib/e11y/reliability/circuit_breaker.rb +19 -21
- data/lib/e11y/reliability/dlq/base.rb +71 -0
- data/lib/e11y/reliability/dlq/file_adapter.rb +301 -0
- data/lib/e11y/reliability/dlq/file_storage.rb +63 -34
- data/lib/e11y/reliability/dlq/filter.rb +37 -54
- data/lib/e11y/reliability/retry_handler.rb +26 -29
- data/lib/e11y/reliability/retry_rate_limiter.rb +3 -11
- data/lib/e11y/sampling/error_spike_detector.rb +0 -2
- data/lib/e11y/sampling/load_monitor.rb +5 -9
- data/lib/e11y/sampling/stratified_tracker.rb +18 -0
- data/lib/e11y/self_monitoring/buffer_monitor.rb +2 -0
- data/lib/e11y/self_monitoring/performance_monitor.rb +19 -61
- data/lib/e11y/self_monitoring/reliability_monitor.rb +4 -74
- data/lib/e11y/slo/config_loader.rb +40 -0
- data/lib/e11y/slo/config_validator.rb +58 -0
- data/lib/e11y/slo/dashboard_generator.rb +122 -0
- data/lib/e11y/slo/event_driven.rb +8 -0
- data/lib/e11y/slo/tracker.rb +31 -4
- data/lib/e11y/testing/have_tracked_event_matcher.rb +190 -0
- data/lib/e11y/testing/rspec_matchers.rb +21 -0
- data/lib/e11y/testing/snapshot_matcher.rb +86 -0
- data/lib/e11y/trace_context/sampler.rb +35 -0
- data/lib/e11y/tracing/faraday_middleware.rb +31 -0
- data/lib/e11y/tracing/net_http_patch.rb +33 -0
- data/lib/e11y/tracing/propagator.rb +116 -0
- data/lib/e11y/tracing.rb +47 -0
- data/lib/e11y/version.rb +1 -1
- data/lib/e11y/versioning/version_extractor.rb +32 -0
- data/lib/e11y.rb +141 -265
- data/lib/generators/e11y/event/event_generator.rb +22 -0
- data/lib/generators/e11y/event/templates/event.rb.tt +16 -0
- data/lib/generators/e11y/grafana_dashboard/grafana_dashboard_generator.rb +30 -0
- data/lib/generators/e11y/grafana_dashboard/templates/e11y_dashboard.json +81 -0
- data/lib/generators/e11y/install/install_generator.rb +34 -0
- data/lib/generators/e11y/install/templates/e11y.rb +239 -0
- data/lib/generators/e11y/prometheus_alerts/prometheus_alerts_generator.rb +29 -0
- data/lib/generators/e11y/prometheus_alerts/templates/e11y_alerts.yml +28 -0
- data/lib/tasks/e11y_docs.rake +30 -0
- data/lib/tasks/e11y_events.rake +71 -0
- data/lib/tasks/e11y_lint.rake +91 -0
- data/lib/tasks/e11y_slo.rake +29 -0
- metadata +129 -39
- data/docs/ADR-010-developer-experience.md +0 -2166
- data/docs/API-REFERENCE-L28.md +0 -914
- data/docs/COMPREHENSIVE-CONFIGURATION.md +0 -2366
- data/docs/CONTRIBUTING.md +0 -312
- data/docs/IMPLEMENTATION_NOTES.md +0 -2804
- data/docs/IMPLEMENTATION_PLAN.md +0 -1971
- data/docs/IMPLEMENTATION_PLAN_ARCHITECTURE.md +0 -586
- data/docs/PLAN.md +0 -148
- data/docs/README.md +0 -296
- data/docs/design/00-memory-optimization.md +0 -593
- data/docs/guides/MIGRATION-L27-L28.md +0 -692
- data/docs/guides/PERFORMANCE-BENCHMARKS.md +0 -434
- data/docs/guides/README.md +0 -44
- data/docs/use_cases/UC-003-pattern-based-metrics.md +0 -1627
- data/lib/e11y/adapters/registry.rb +0 -141
- data/docs/{ADR-012-event-evolution.md → architecture/ADR-012-event-evolution.md} +0 -0
@@ -381,7 +381,7 @@ end
 
 ### Layer 4: DLQ Filter Integration (C02 Resolution) ⚠️
 
-> **Reference:** See [ADR-013 §4.6: Rate Limiting × DLQ Filter](../ADR-013-reliability-error-handling.md#46-rate-limiting--dlq-filter-interaction-c02-resolution) for full architecture.
+> **Reference:** See [ADR-013 §4.6: Rate Limiting × DLQ Filter](../architecture/ADR-013-reliability-error-handling.md#46-rate-limiting--dlq-filter-interaction-c02-resolution) for full architecture.
 
 **Problem:** Rate limiting drops events BEFORE they reach DLQ filter. Critical events (e.g., payments) may be lost during traffic spikes, even though DLQ filter says "always save payments".
 
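The problem statement in the hunk above can be made concrete with a small sketch. It assumes a hypothetical name-pattern rule for "always save" events. The method names `should_save_to_dlq?` and `handle_rate_limited` match the new middleware shown later in this diff, but the bodies, the `patterns:` and `dlq:` parameters, and the return symbols are illustrative assumptions, not the gem's code.

```ruby
# Hypothetical "always save" rule: event names starting with "payment."
# are treated as critical. The pattern list is an assumption for illustration.
def should_save_to_dlq?(event_data, patterns: [/\Apayment\./])
  name = event_data[:event_name].to_s
  patterns.any? { |pattern| pattern.match?(name) }
end

# Called when an event has already been rejected by the rate limiter.
# Critical events are appended to the DLQ (any object responding to <<)
# instead of being dropped.
def handle_rate_limited(event_data, dlq:)
  return :dropped unless should_save_to_dlq?(event_data)

  dlq << event_data
  :saved_to_dlq
end
```

With this shape, a rate-limited `payment.retry` event lands in the DLQ, while an event that matches no rule is dropped as before.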
@@ -472,7 +472,7 @@ end
 
 ### Layer 5: Retry Rate Limiting (C06 Resolution) ⚠️
 
-> **Reference:** See [ADR-013 §3.5: Retry Rate Limiting](../ADR-013-reliability-error-handling.md#35-retry-rate-limiting-c06-resolution) for full architecture.
+> **Reference:** See [ADR-013 §3.5: Retry Rate Limiting](../architecture/ADR-013-reliability-error-handling.md#35-retry-rate-limiting-c06-resolution) for full architecture.
 
 **Problem:** Adapter failures trigger retries. If 1000 events fail → 3000 retry attempts (thundering herd) → buffer overflow.
 
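The arithmetic in the problem statement above (1000 failed events × 3 retries = 3000 attempts) is what a retry limiter has to cap. The following is a minimal sketch of a fixed-window retry budget. `RetryBudget`, `allow_retry?`, and the injectable `clock:` are hypothetical names chosen for illustration; the gem's `retry_rate_limiter.rb` is not shown in this diff and may work differently.

```ruby
# Hypothetical fixed-window retry budget: at most `limit` retries are
# admitted per `window` seconds; further retries are refused until the
# window rolls over.
class RetryBudget
  def initialize(limit:, window:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @limit = limit
    @window = window
    @clock = clock
    @window_start = @clock.call
    @count = 0
    @mutex = Mutex.new
  end

  # Returns true and consumes one slot if the budget allows a retry now.
  def allow_retry?
    @mutex.synchronize do
      now = @clock.call
      if now - @window_start >= @window
        # A new window has started: reset the counter.
        @window_start = now
        @count = 0
      end
      return false if @count >= @limit

      @count += 1
      true
    end
  end
end
```

What happens to a refused retry (deferring it or routing it to the DLQ) is a separate design decision that this sketch leaves open.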
@@ -732,130 +732,85 @@ end
 
 ---
 
-## 📊 Implementation
+## 📊 Implementation: In-Memory Token Bucket
 
-**
+**Current implementation uses in-memory token bucket algorithm (no Redis dependency):**
 
 ```ruby
-# lib/e11y/
+# lib/e11y/middleware/rate_limiting.rb
 module E11y
-  module
-    class
-      def initialize(
-
-        @
-
-
-
-        #
-
-
-
-
-
-
-
-
-
-
-
-        true
-      end
-
-      private
-
-      def check_global_limit(event)
-        key = 'e11y:rate_limit:global'
-        limit = @config.global_limit
-        window = @config.global_window
-
-        check_limit(key, limit, window)
-      end
-
-      def check_per_event_limit(event)
-        limit_config = @config.per_event_limits[event.event_name]
-        return true unless limit_config
-
-        key = "e11y:rate_limit:event:#{event.event_name}"
-        check_limit(key, limit_config[:limit], limit_config[:window])
-      end
-
-      def check_per_context_limits(event)
-        @config.per_context_limits.all? do |field, limit_config|
-          value = extract_context_value(event, field, limit_config[:extractor])
-          next true unless value
-
-          key = "e11y:rate_limit:context:#{field}:#{value}"
-          check_limit(key, limit_config[:limit], limit_config[:window])
+  module Middleware
+    class RateLimiting < Base
+      def initialize(app, global_limit: 10_000, per_event_limit: 1_000, window: 1.0)
+        super(app)
+        @global_limit = global_limit
+        @per_event_limit = per_event_limit
+        @window = window
+
+        # Token buckets for rate limiting (in-memory)
+        @global_bucket = TokenBucket.new(
+          capacity: @global_limit,
+          refill_rate: @global_limit,
+          window: @window
+        )
+        @per_event_buckets = Hash.new do |hash, event_name|
+          hash[event_name] = TokenBucket.new(
+            capacity: @per_event_limit,
+            refill_rate: @per_event_limit,
+            window: @window
+          )
         end
+
+        @mutex = Mutex.new
       end
-
-      def
-
-
-
-
-
-
-
-        # Count current entries
-        current_count = @redis.zcard(key)
-
-        if current_count < limit
-          # Add new entry
-          @redis.zadd(key, now, "#{now}-#{SecureRandom.hex(8)}")
-          @redis.expire(key, window.to_i + 60) # TTL = window + buffer
-          true
-        else
-          # Limit exceeded
-          handle_exceeded(key, current_count, limit)
-          false
+
+      def call(event_data)
+        event_name = event_data[:event_name]
+
+        # Check global rate limit
+        unless @global_bucket.allow?
+          handle_rate_limited(event_data, :global)
+          return nil
         end
-
-
-
-
-
-
-          key: key
-        )
-
-        # Log warning
-        E11y.logger.warn(
-          "[E11y] Rate limit exceeded: #{key} (#{current}/#{limit})"
-        )
-
-        # Alert if configured
-        if @config.alert_on_limit
-          alert_rate_limit_exceeded(key, current, limit)
+
+        # Check per-event rate limit
+        per_event_bucket = @mutex.synchronize { @per_event_buckets[event_name] }
+        unless per_event_bucket.allow?
+          handle_rate_limited(event_data, :per_event)
+          return nil
         end
+
+        # Rate limit not exceeded - continue pipeline
+        event_data
       end
-
-
-
-
-
-
-
-
-            rule[:values].include?(event.severity)
-          when :contexts
-            rule[:values].all? { |k, v| event.context[k] == v }
-          when :custom
-            rule[:condition].call(event)
-          end
-        end
+
+      private
+
+      def handle_rate_limited(event_data, limit_type)
+        # C02 Resolution: Check if event should be saved to DLQ
+        return unless should_save_to_dlq?(event_data)
+
+        save_to_dlq(event_data, limit_type)
       end
     end
   end
 end
 ```
 
+**Why In-Memory Token Bucket?**
+- ✅ **Fast:** No network latency (O(1) operations)
+- ✅ **Simple:** No external dependencies (Redis not required)
+- ✅ **Thread-safe:** Mutex-protected token buckets
+- ✅ **Smooth rate limiting:** Token bucket avoids bursty behavior
+- ⚠️ **Trade-off:** Per-process limits (not shared across instances)
+
+**Note:** In-memory rate limiting is sufficient for most use cases. Each application process maintains its own rate limits, which is appropriate for event tracking workloads.
+
 ---
 
 ## 🔧 Implementation Details
 
-> **Implementation:** See [ADR-006 Section 4.0: Rate Limiting + Retry Policy Resolution](../ADR-006-security-compliance.md#40-rate-limiting--retry-policy-resolution-conflict-14) for detailed architecture.
+> **Implementation:** See [ADR-006 Section 4.0: Rate Limiting + Retry Policy Resolution](../architecture/ADR-006-security-compliance.md#40-rate-limiting--retry-policy-resolution-conflict-14) for detailed architecture.
 
 ### Middleware Flow
 
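The middleware added in the hunk above calls `TokenBucket.new(capacity:, refill_rate:, window:)` and `allow?`, but the class itself does not appear in this diff. The following is a minimal sketch of what such a class could look like. It assumes `refill_rate` means tokens per `window` seconds, and it adds an injectable `clock:` parameter purely for testability. Both are assumptions, and the gem's real implementation may differ.

```ruby
# Hypothetical token bucket matching the constructor keywords used by the
# middleware above. Tokens refill continuously in proportion to elapsed time.
class TokenBucket
  def initialize(capacity:, refill_rate:, window: 1.0,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @capacity = capacity.to_f
    # Assumption: refill_rate is expressed per window, so convert to per second.
    @tokens_per_second = refill_rate.to_f / window
    @clock = clock
    @tokens = @capacity          # start full, which permits an initial burst
    @last_refill = @clock.call
    @mutex = Mutex.new
  end

  # Consumes one token and returns true, or returns false when empty.
  def allow?
    @mutex.synchronize do
      refill
      if @tokens >= 1.0
        @tokens -= 1.0
        true
      else
        false
      end
    end
  end

  private

  # Adds tokens for the time elapsed since the last call, capped at capacity.
  def refill
    now = @clock.call
    elapsed = now - @last_refill
    @last_refill = now
    @tokens = [@capacity, @tokens + elapsed * @tokens_per_second].min
  end
end
```

Using a monotonic clock by default avoids miscounting elapsed time when the wall clock is adjusted. Each bucket carries its own mutex here, so `allow?` is safe to call from several threads.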
@@ -980,53 +935,27 @@ end
 
 ---
 
-###
+### Token Bucket Algorithm
 
-E11y uses **
+E11y uses **in-memory token bucket algorithm** for rate limiting.
 
-**
+**How Token Bucket Works:**
+- Each bucket has a **capacity** (max tokens) and **refill rate** (tokens per second)
+- When event arrives: Check if token available → consume token if yes
+- Tokens refill continuously based on elapsed time
+- Allows burst traffic up to capacity, then smooth rate limiting
 
-
-
-
-
-
-
-  redis.zremrangebyscore(key, 0, window_start)
-
-  # 2. Count current entries
-  current_count = redis.zcard(key)
-
-  # 3. Check limit
-  if current_count < limit
-    # Add new entry (score = timestamp, member = unique ID)
-    redis.zadd(key, now, "#{now}-#{SecureRandom.hex(8)}")
-    redis.expire(key, window.to_i + 60) # TTL cleanup
-    true # Allowed
-  else
-    false # Rate limited
-  end
-end
-```
+**Why Token Bucket?**
+- ✅ **Smooth rate limiting:** Avoids bursty behavior
+- ✅ **Burst support:** Allows traffic spikes up to capacity
+- ✅ **Fast:** O(1) operations (no external dependencies)
+- ✅ **Industry standard:** Used by Nginx, AWS API Gateway, Google Cloud
+- ⚠️ **Per-process limits:** Each application instance has separate limits
 
-**
--
--
--
-- ✅ **Automatic cleanup:** Redis TTL handles old entries
-
-**Redis Keys:**
-```ruby
-# Global limit
-"e11y:rate_limit:global"
-
-# Per-event limit
-"e11y:rate_limit:event:payment.retry"
-
-# Per-context limit
-"e11y:rate_limit:context:user_id:user-123"
-"e11y:rate_limit:context:ip_address:192.168.1.100"
-```
+**Memory Usage:**
+- Global bucket: ~100 bytes (single TokenBucket instance)
+- Per-event buckets: ~100 bytes per unique event type (lazy initialization)
+- Example: 100 event types × 100 bytes = ~10KB total
 
 ---
 
@@ -1095,30 +1024,28 @@ end
 # Overhead: ~0.5μs (5% increase)
 ```
 
-**
+**In-Memory Performance:**
 ```ruby
-#
-# 1.
-# 2.
-# 3.
-#
-# Total: ~0.25ms per event
+# Token bucket operations per event (within limit):
+# 1. Mutex lock ~0.001ms
+# 2. Refill calculation ~0.001ms
+# 3. Token consumption ~0.001ms
+# Total: ~0.003ms per event (3000x faster than Redis!)
 
 # When rate limited:
-# 1.
-# 2.
-#
+# 1. Mutex lock ~0.001ms
+# 2. Refill calculation ~0.001ms
+# 3. Check tokens (0 available) ~0.001ms
+# Total: ~0.003ms (no network overhead)
 ```
 
 **Scaling:**
 ```ruby
-#
-# - Global
-# - Per-event
-# -
-#
-# Example: 1000 users × 50KB = 50MB
-# → Acceptable for most deployments
+# Memory usage (in-memory):
+# - Global bucket: ~100 bytes
+# - Per-event bucket: ~100 bytes per unique event type
+# - Example: 100 event types × 100 bytes = ~10KB total
+# → Negligible memory footprint
 ```
 
 ---
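The per-process trade-off noted in this file's new text has a simple consequence for sizing: because limits are not shared across instances, the cluster as a whole can admit up to the per-process limit multiplied by the number of processes. A small sketch of that arithmetic follows. Both helper names are hypothetical and not part of the gem.

```ruby
# Upper bound on events admitted per window across the whole deployment
# when every process enforces its own limit independently.
def cluster_wide_ceiling(per_process_limit:, processes:)
  per_process_limit * processes
end

# Per-process limit that keeps the deployment at or below a cluster-wide
# target. Integer division rounds down, so the target is never exceeded.
def per_process_limit_for(cluster_target:, processes:)
  cluster_target / processes
end
```

For example, assuming 8 processes with the default `global_limit: 10_000` shown earlier in this diff, the deployment can admit up to 80,000 events per window. Holding the whole deployment to 10,000 would mean configuring 1,250 per process.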
@@ -1198,7 +1125,7 @@ end
 
 ## 📊 Self-Monitoring & Metrics
 
-> **Implementation:** See [ADR-006 Section 4: Rate Limiting](../ADR-006-security-compliance.md#4-rate-limiting) for detailed architecture.
+> **Implementation:** See [ADR-006 Section 4: Rate Limiting](../architecture/ADR-006-security-compliance.md#4-rate-limiting) for detailed architecture.
 
 E11y provides comprehensive self-monitoring metrics for rate limiting. These metrics help you understand rate limit behavior, detect attacks, and optimize limits.
 
@@ -240,7 +240,7 @@ end
 > - ✅ Separate storage (isolated from app DB)
 > - ✅ Long retention (7-10 years)
 >
-> **Implementation:** See [ADR-015 §3.3: Audit Event Pipeline Separation](../ADR-015-middleware-order.md#33-audit-event-pipeline-separation-c01-resolution) for full architecture.
+> **Implementation:** See [ADR-015 §3.3: Audit Event Pipeline Separation](../architecture/ADR-015-middleware-order.md#33-audit-event-pipeline-separation-c01-resolution) for full architecture.
 
 ---
 
@@ -424,15 +424,11 @@ Events::UserDeleted.track(
 E11y.configure do |config|
   config.audit_trail do
     # Separate storage for audit events
-    storage adapter: :postgresql, # OR :
+    storage adapter: :postgresql, # OR :file
             table: 'audit_events',
             read_only: true # Can't UPDATE/DELETE
 
-    #
-    # storage adapter: :s3,
-    # bucket: 'company-audit-trail',
-    # object_lock: true, # WORM (Write Once Read Many)
-    # retention_period: 7.years
+    # Object storage with WORM (external; E11y uses retention_until for archival filtering)
   end
 end
 
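The new comment in the hunk above delegates archival to an external process that filters on `retention_until`. Below is a minimal sketch of such a filter, assuming events are plain hashes carrying an ISO8601 `:retention_until` string (the format is stated later in this diff). The `partition_by_retention` helper and its `now:` parameter are hypothetical.

```ruby
require "time"

# Splits events into [still_retained, expired] by comparing each event's
# ISO8601 retention_until timestamp against `now`.
def partition_by_retention(events, now: Time.now.utc)
  events.partition do |event|
    Time.iso8601(event.fetch(:retention_until)) > now
  end
end
```

An external job could archive or delete the second group while leaving the first untouched. `Hash#fetch` raises on events without a `retention_until`, so a malformed record fails loudly instead of being silently treated as expired.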
@@ -568,14 +564,13 @@ E11y.configure do |config|
 
     # === ARCHIVAL ===
     archive_after 1.year,
-                  to: :
-                  bucket: 'company-audit-archive'
+                  to: :archive, # Cold storage (external job filters by retention_until)
   end
 end
 
 # How it works:
-# 1. Events stored in hot storage (PostgreSQL/
-# 2. After 1 year → moved to cold storage (
+# 1. Events stored in hot storage (PostgreSQL/File)
+# 2. After 1 year → moved to cold storage (archival job filters by retention_until)
 # 3. After retention period → permanently deleted
 # 4. Deletion logged as audit event (audit the audit!)
 ```
@@ -1239,7 +1234,7 @@ end
 
 ## 🔧 Implementation Details
 
-> **Implementation:** See [ADR-006 Section 5: Audit Trail](../ADR-006-security-compliance.md#5-audit-trail) for detailed architecture.
+> **Implementation:** See [ADR-006 Section 5: Audit Trail](../architecture/ADR-006-security-compliance.md#5-audit-trail) for detailed architecture.
 
 ### Audit Middleware Architecture
 
@@ -1554,17 +1549,18 @@ module E11y
 end
 ```
 
-**3.
+**3. Object Storage Audit Adapter (conceptual; not in E11y)**
+
+> E11y does not provide an S3/object-storage adapter. For cloud WORM storage, use OTel Collector's object-storage exporter, or an external archival job that filters Loki by `retention_until`. Events carry `retention_until` (ISO8601) for easy filtering.
 
 ```ruby
-#
+# Conceptual: Object storage with WORM (e.g., S3 Object Lock)
+# E11y does NOT implement this — use external archival
 module E11y
   module Adapters
-    class
+    class ObjectStorageAudit < Base # Conceptual only
       def initialize(config)
         @bucket = config.bucket
-        @s3_client = Aws::S3::Client.new
-        @object_lock = config.object_lock || true
         @retention_days = config.retention_days || 2555 # 7 years
       end
 
@@ -1573,37 +1569,16 @@ module E11y
       end
 
       def write(event_data)
-
-
-
-          bucket: @bucket,
-          key: object_key,
-          body: event_data.to_json,
-          content_type: 'application/json',
-
-          # WORM: Object Lock prevents deletion
-          object_lock_mode: 'GOVERNANCE', # OR 'COMPLIANCE' (stricter)
-          object_lock_retain_until_date: @retention_days.days.from_now,
-
-          # Metadata for audit
-          metadata: {
-            'event-id' => event_data[:event_id],
-            'event-name' => event_data[:event_name],
-            'signed-at' => event_data[:signed_at],
-            'signature-algorithm' => event_data[:signature_algorithm]
-          }
-        )
+        # Filter by retention_until for archival decisions
+        # object_key = "#{event_data[:retention_until]}/#{event_data[:event_id]}.json"
+        # ... PUT to object storage with WORM ...
       end
 
       private
 
       def audit_object_key(event_data)
-
-
-        hour = timestamp.hour
-
-        # Partition by date and hour for efficient queries
-        "audit/#{date.strftime('%Y/%m/%d')}/hour=#{hour.to_s.rjust(2, '0')}/#{event_data[:event_id]}.json"
+        ts = Time.parse(event_data[:timestamp])
+        "audit/#{ts.strftime('%Y/%m/%d')}/#{event_data[:event_id]}.json"
       end
     end
   end
@@ -1715,7 +1690,7 @@ end
 
 ## ⚡ Performance Guarantees
 
-> **Implementation:** See [ADR-006 Section 5.2: Cryptographic Signing](../ADR-006-security-compliance.md#52-cryptographic-signing) for detailed architecture.
+> **Implementation:** See [ADR-006 Section 5.2: Cryptographic Signing](../architecture/ADR-006-security-compliance.md#52-cryptographic-signing) for detailed architecture.
 
 E11y audit trail is designed for **high-performance production environments** with strict SLOs. Audit events must not significantly impact application latency.
 
@@ -1880,10 +1855,10 @@ end
 |-----------------|---------------------|------------|----------|
 | **File (append-only)** | 1-2ms | 10,000/sec | Simple, local, fast |
 | **PostgreSQL** | 2-5ms | 5,000/sec | Queryable, ACID |
-| **
+| **Object storage (WORM)** | 10-50ms | 1,000/sec | Cloud, immutable (external archival) |
 | **Elasticsearch** | 5-10ms | 3,000/sec | Full-text search |
 
-**Recommendation:** Use **File adapter** for lowest latency, **PostgreSQL** for queryability,
+**Recommendation:** Use **File adapter** for lowest latency, **PostgreSQL** for queryability. For cloud WORM, use external archival (filter by `retention_until`).
 
 ---
 