mammoth 0.0.0 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 8226f7d0510693efd8d3c88623f346bcd24e88f80935c54581994d23b5a64767
4
- data.tar.gz: 69a7da12ff8b589851ea0a51d3a68e7c25a2521c49eaaf222afef0a020a5d524
3
+ metadata.gz: 137dbf7bc96183b45b2756dcb736884a66a45e920a884d35829f74518f14cef6
4
+ data.tar.gz: 0dd31f792456297c5ac2e6f16ef6842211a87de63824e2e6707c7bac00a68e59
5
5
  SHA512:
6
- metadata.gz: 4ecaf03600872e85d979e4c546e2e5f102dea80331ba216d18dd4d8df3c1c09d4c17148634f09b17f755cff86051e2205e290ffe1e4be1a760d9e5851237b179
7
- data.tar.gz: 4b8ade4da78b6a4db908fb909c7a11b6dad9572a0d8d4bf3fdb55a55d6b491adb61785b89888deea4c8c9e140cb9c48d78847a5190f4601a03228de4bd6ba577
6
+ metadata.gz: ff73c21bfee44d9b5c9d2f02535a0fdd2dadc9b86b868b8219305889b54c3ef50c8818092d6d60c65f96b8fa39097104e64cc7cc694af646bdd1208be128d134
7
+ data.tar.gz: 991cb45e7f3cbeb18afde25f41d0a1f1c4c3479f920c0551ca33ab2b93b266ad48a9bf258e46c79c68164c1deb7c493c5b252f616563d67f98b9df4c49057a08
data/CHANGELOG.md CHANGED
@@ -1,14 +1,70 @@
1
1
  # Changelog
2
2
 
3
- ## 0.1.0 - Unreleased
4
-
5
- - Rename product and gem from Echo to Mammoth.
6
- - Position Mammoth OSS as the reliable PostgreSQL change-event delivery appliance.
7
- - Add pgoutput-client / parser / decoder / source-adapter integration boundary.
8
- - Serialize CDC-core `ChangeEvent` shaped work into webhook payloads.
9
- - Flatten CDC-core `TransactionEnvelope` shaped work before sink delivery.
10
- - Add `mammoth start CONFIG` CLI command for live operation.
11
- - Add public Helm chart under `charts/mammoth`.
12
- - Add slim multi-stage Dockerfile for OSS image builds.
13
- - Add e2e test task and script using real HTTP, SQLite, and filesystem paths.
14
- - Switch OSS license metadata to MIT.
3
+ ## Unreleased
4
+
5
+
6
+ ## [0.1.0] - 2026-06-17
7
+
8
+ Initial public release of Mammoth.
9
+
10
+ ### Added
11
+
12
+ - PostgreSQL logical replication source integration via pgoutput.
13
+ - CDC event normalization through the CDC ecosystem.
14
+ - Webhook delivery destination.
15
+ - SQLite-backed operational state storage.
16
+ - Checkpoint persistence infrastructure.
17
+ - Dead-letter persistence infrastructure.
18
+ - Retry handling for failed webhook deliveries.
19
+ - Configuration validation command.
20
+ - Operational state bootstrap command.
21
+ - Operational status command.
22
+ - Sample event delivery command.
23
+ - Helm chart for Kubernetes deployments.
24
+ - Persistent volume support for SQLite operational state.
25
+ - Example configurations and runnable demonstrations.
26
+
27
+ ### Examples
28
+
29
+ - `examples/postgres_webhook`
30
+
31
+ - Demonstrates webhook delivery using sample CDC-shaped events.
32
+
33
+ - `examples/live_postgres_webhook`
34
+
35
+ - Demonstrates end-to-end PostgreSQL logical replication.
36
+ - Demonstrates replication slot management.
37
+ - Demonstrates webhook delivery from live database changes.
38
+
39
+ - `examples/operational_state`
40
+
41
+ - Demonstrates operational state bootstrap.
42
+ - Demonstrates checkpoint and dead-letter schema initialization.
43
+
44
+ - `examples/failing_webhook_retry`
45
+
46
+ - Demonstrates retry exhaustion behavior.
47
+ - Demonstrates durable dead-letter persistence.
48
+
49
+ - `examples/kubernetes_helm`
50
+
51
+ - Demonstrates Helm-based deployment.
52
+ - Demonstrates PVC-backed operational memory.
53
+
54
+ ### Validation
55
+
56
+ The 0.1.0 release was manually validated through:
57
+
58
+ - PostgreSQL logical replication slot creation and consumption.
59
+ - Idle replication connections exceeding one hour.
60
+ - Post-idle event delivery.
61
+ - Webhook delivery success path.
62
+ - Webhook failure and dead-letter path.
63
+ - SQLite checkpoint and dead-letter persistence.
64
+ - Helm chart rendering and installation.
65
+ - Kubernetes deployment on Kind.
66
+ - PVC-backed operational state storage.
67
+
68
+ ### Notes
69
+
70
+ Mammoth currently operates as a single active consumer per PostgreSQL logical replication slot. The default Helm deployment uses a single replica to align with PostgreSQL logical replication semantics.
data/README.md CHANGED
@@ -2,7 +2,7 @@
2
2
 
3
3
  [![Gem Version](https://badge.fury.io/rb/mammoth.svg)](https://badge.fury.io/rb/mammoth)
4
4
  [![CI](https://github.com/kanutocd/mammoth/workflows/CI/badge.svg)](https://github.com/kanutocd/mammoth/actions)
5
- [![Ruby Version](https://img.shields.io/badge/ruby-%3E%3D%203.4-ruby.svg)](https://www.ruby-lang.org/en/)
5
+ [![Ruby Version](https://img.shields.io/badge/ruby-%3E%3D%204.0-ruby.svg)](https://www.ruby-lang.org/en/)
6
6
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
7
7
 
8
8
  Mammoth is a self-hosted PostgreSQL event relay focused on reliable delivery
@@ -11,9 +11,7 @@ of database change events.
11
11
  ```text
12
12
  PostgreSQL
13
13
 
14
- pgoutput-client / parser / decoder
15
-
16
- Pgoutput::SourceAdapter::Cdc
14
+ CDC Ecosystem source adapter
17
15
 
18
16
  CDC::Core::ChangeEvent
19
17
 
@@ -39,7 +37,7 @@ Mammoth OSS includes:
39
37
  - webhook delivery sink
40
38
  - delivery worker with retry, checkpoint, and DLQ handling
41
39
  - CDC-core event serialization boundary
42
- - pgoutput-client / parser / decoder / source-adapter integration boundary
40
+ - CDC Ecosystem source-adapter integration boundary
43
41
  - Docker image support
44
42
  - public Helm chart support
45
43
  - unit and e2e test tasks
data/config/mammoth.example.yml ADDED
@@ -0,0 +1,39 @@
1
+ # yaml-language-server: $schema=./mammoth.schema.json
2
+
3
+ mammoth:
4
+   name: local_mammoth
5
+
6
+ postgres:
7
+   host: localhost
8
+   port: 5432
9
+   database: app_development
10
+   username: mammoth
11
+   password_env: MAMMOTH_POSTGRES_PASSWORD
12
+
13
+ replication:
14
+   slot: mammoth_prod
15
+   publications:
16
+     - mammoth_publication
17
+   auto_create_slot: false
18
+   temporary_slot: false
19
+   feedback_interval: 10.0
20
+
21
+ webhook:
22
+   name: primary_webhook
23
+   url: https://example.com/webhooks/postgres
24
+   timeout_seconds: 5
25
+
26
+ retry:
27
+   max_attempts: 5
28
+   schedule_seconds:
29
+     - 1
30
+     - 5
31
+     - 30
32
+     - 60
33
+     - 300
34
+
35
+ sqlite:
36
+   path: data/mammoth.db
37
+
38
+ logging:
39
+   level: info
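Reviewer note: the example configuration stores the name of an environment variable in `postgres.password_env` rather than the password itself. The following is a minimal sketch of how such a file could be read with Ruby's standard `yaml` library. The inline `CONFIG_YAML` constant is a trimmed, illustrative copy of the example file, and the sketch deliberately bypasses `Mammoth::Configuration` and its schema validation.

```ruby
require "yaml"

# Trimmed inline copy of the example configuration (illustrative only).
CONFIG_YAML = <<~YAML
  mammoth:
    name: local_mammoth
  postgres:
    host: localhost
    port: 5432
    database: app_development
    username: mammoth
    password_env: MAMMOTH_POSTGRES_PASSWORD
  replication:
    slot: mammoth_prod
    publications:
      - mammoth_publication
    auto_create_slot: false
    temporary_slot: false
    feedback_interval: 10.0
YAML

config = YAML.safe_load(CONFIG_YAML)

# The config holds the *name* of the variable; the secret stays in the environment.
password_env = config.dig("postgres", "password_env")
password = ENV.fetch(password_env, nil)

publications = Array(config.dig("replication", "publications"))
puts "slot=#{config.dig("replication", "slot")} publications=#{publications.join(",")}"
puts "password present: #{!password.nil?}"
```

Keeping the secret out of the YAML file means the same file can be committed or mounted from a ConfigMap without exposing credentials.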
data/config/mammoth.schema.json ADDED
@@ -0,0 +1,179 @@
1
+ {
2
+ "$schema": "http://json-schema.org/draft-04/schema#",
3
+ "id": "https://kanutocd.github.io/mammoth/schema/mammoth.schema.json",
4
+ "title": "Mammoth Configuration",
5
+ "type": "object",
6
+ "additionalProperties": false,
7
+
8
+ "required": [
9
+ "mammoth",
10
+ "postgres",
11
+ "replication",
12
+ "webhook",
13
+ "retry",
14
+ "sqlite",
15
+ "logging"
16
+ ],
17
+
18
+ "properties": {
19
+ "mammoth": {
20
+ "type": "object",
21
+ "additionalProperties": false,
22
+ "required": ["name"],
23
+ "properties": {
24
+ "name": {
25
+ "type": "string",
26
+ "minLength": 1
27
+ }
28
+ }
29
+ },
30
+
31
+ "postgres": {
32
+ "type": "object",
33
+ "additionalProperties": false,
34
+ "required": [
35
+ "host",
36
+ "port",
37
+ "database",
38
+ "username",
39
+ "password_env"
40
+ ],
41
+ "properties": {
42
+ "host": {
43
+ "type": "string",
44
+ "minLength": 1
45
+ },
46
+ "port": {
47
+ "type": "integer",
48
+ "minimum": 1,
49
+ "maximum": 65535
50
+ },
51
+ "database": {
52
+ "type": "string",
53
+ "minLength": 1
54
+ },
55
+ "username": {
56
+ "type": "string",
57
+ "minLength": 1
58
+ },
59
+ "password_env": {
60
+ "type": "string",
61
+ "minLength": 1
62
+ }
63
+ }
64
+ },
65
+
66
+ "replication": {
67
+ "type": "object",
68
+ "additionalProperties": false,
69
+ "required": [
70
+ "slot",
71
+ "publications"
72
+ ],
73
+ "properties": {
74
+ "slot": {
75
+ "type": "string",
76
+ "minLength": 1
77
+ },
78
+ "publications": {
79
+ "type": "array",
80
+ "minItems": 1,
81
+ "items": {
82
+ "type": "string",
83
+ "minLength": 1
84
+ }
85
+ },
86
+ "start_lsn": {
87
+ "type": ["string", "null"]
88
+ },
89
+ "auto_create_slot": {
90
+ "type": "boolean"
91
+ },
92
+ "temporary_slot": {
93
+ "type": "boolean"
94
+ },
95
+ "feedback_interval": {
96
+ "type": "number",
97
+ "minimum": 0,
98
+ "exclusiveMinimum": true
99
+ }
100
+ }
101
+ },
102
+
103
+ "webhook": {
104
+ "type": "object",
105
+ "additionalProperties": false,
106
+ "required": [
107
+ "name",
108
+ "url",
109
+ "timeout_seconds"
110
+ ],
111
+ "properties": {
112
+ "name": {
113
+ "type": "string",
114
+ "minLength": 1
115
+ },
116
+ "url": {
117
+ "type": "string",
118
+ "format": "uri"
119
+ },
120
+ "timeout_seconds": {
121
+ "type": "integer",
122
+ "minimum": 1
123
+ }
124
+ }
125
+ },
126
+
127
+ "retry": {
128
+ "type": "object",
129
+ "additionalProperties": false,
130
+ "required": [
131
+ "max_attempts",
132
+ "schedule_seconds"
133
+ ],
134
+ "properties": {
135
+ "max_attempts": {
136
+ "type": "integer",
137
+ "minimum": 1
138
+ },
139
+ "schedule_seconds": {
140
+ "type": "array",
141
+ "minItems": 1,
142
+ "items": {
143
+ "type": "integer",
144
+ "minimum": 1
145
+ }
146
+ }
147
+ }
148
+ },
149
+
150
+ "sqlite": {
151
+ "type": "object",
152
+ "additionalProperties": false,
153
+ "required": ["path"],
154
+ "properties": {
155
+ "path": {
156
+ "type": "string",
157
+ "minLength": 1
158
+ }
159
+ }
160
+ },
161
+
162
+ "logging": {
163
+ "type": "object",
164
+ "additionalProperties": false,
165
+ "required": ["level"],
166
+ "properties": {
167
+ "level": {
168
+ "type": "string",
169
+ "enum": [
170
+ "debug",
171
+ "info",
172
+ "warn",
173
+ "error"
174
+ ]
175
+ }
176
+ }
177
+ }
178
+ }
179
+ }
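Reviewer note: the schema targets JSON Schema draft-04, where `exclusiveMinimum` is a boolean that modifies `minimum`. For `feedback_interval` this means the value must be strictly greater than 0. The sketch below restates two of the schema's numeric rules as plain Ruby predicates so the intended bounds are easy to check by hand. The helper names are hypothetical and not part of Mammoth, which validates through the `json-schema` gem.

```ruby
# Hypothetical helpers that mirror two rules from mammoth.schema.json.

# "type": "number", "minimum": 0, "exclusiveMinimum": true  => value > 0
def valid_feedback_interval?(value)
  (value.is_a?(Integer) || value.is_a?(Float)) && value > 0
end

# "type": "integer", "minimum": 1, "maximum": 65535
def valid_port?(value)
  value.is_a?(Integer) && value.between?(1, 65_535)
end

puts valid_feedback_interval?(10.0) # true
puts valid_feedback_interval?(0)    # false, zero is excluded
puts valid_port?(5432)              # true
puts valid_port?(70_000)            # false
```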
data/exe/mammoth ADDED
@@ -0,0 +1,6 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "mammoth"
5
+
6
+ exit Mammoth::CLI.call(ARGV)
data/lib/mammoth/application.rb CHANGED
@@ -1,23 +1,25 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Mammoth
4
- # Top-level Mammoth application composition root.
4
+ # Top-level Mammoth application runtime.
5
5
  #
6
- # Application wires Mammoth's current v0.1.0 runtime pieces: configuration,
7
- # SQLite operational memory, replication boundary, delivery worker, checkpoint
8
- # store, dead letter store, and webhook sink.
6
+ # Application wires Mammoth's delivery-side runtime pieces: configuration,
7
+ # SQLite operational memory, replication consumer, delivery worker, checkpoint
8
+ # store, dead letter store, and webhook sink. Upstream PostgreSQL transport
9
+ # composition stays outside this class so the application runtime consumes an
10
+ # injected CDC work source rather than owning upstream CDC source-adapter
11
+ # lifecycle decisions.
9
12
  class Application
10
13
  attr_reader :config, :sqlite_store, :consumer, :delivery_worker
11
14
 
12
15
  # @param config [Mammoth::Configuration] loaded configuration
13
16
  # @param source [#each, nil] injectable event source for tests and demos
14
- # @param adapter [#call, nil] optional source adapter
15
17
  # @param sink [#deliver, nil] optional destination sink
16
18
  # @param sleeper [#call] retry sleep strategy
17
- def initialize(config, source: nil, adapter: nil, sink: nil, sleeper: Kernel.method(:sleep))
19
+ def initialize(config, source: nil, sink: nil, sleeper: Kernel.method(:sleep))
18
20
  @config = config
19
21
  @sqlite_store = SQLiteStore.connect(config.dig("sqlite", "path")).bootstrap!
20
- @consumer = ReplicationConsumer.new(config, source: source, adapter: adapter)
22
+ @consumer = ReplicationConsumer.new(source: source)
21
23
  @delivery_worker = build_delivery_worker(sink: sink || WebhookSink.from_config(config), sleeper: sleeper)
22
24
  end
23
25
 
data/lib/mammoth/cdc_source.rb ADDED
@@ -0,0 +1,11 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Mammoth
4
+ # Backwards-compatible internal alias for the PostgreSQL CDC source.
5
+ #
6
+ # New code should use {Mammoth::Sources::Postgres}. Mammoth v0.1.0 keeps this
7
+ # constant so older tests or examples that referenced the transitional
8
+ # CdcSource name continue to work while the product-facing source name moves
9
+ # to PostgreSQL.
10
+ CdcSource = Sources::Postgres
11
+ end
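Reviewer note: `CdcSource = Sources::Postgres` is a plain constant alias, so both names refer to the same class object, and `Sources::Postgres` must already be loaded when this file is required. A minimal sketch of that behaviour, using a hypothetical `Demo` namespace in place of `Mammoth`:

```ruby
# Hypothetical namespace standing in for Mammoth.
module Demo
  module Sources
    # Stand-in for the real PostgreSQL source class.
    class Postgres; end
  end

  # Alias: a second constant pointing at the same class object.
  CdcSource = Sources::Postgres
end

puts Demo::CdcSource.equal?(Demo::Sources::Postgres) # true
puts Demo::CdcSource.name                            # Demo::Sources::Postgres
```

Because the alias does not create a subclass, `is_a?` checks and error messages keep reporting the new `Sources::Postgres` name.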
data/lib/mammoth/cli.rb CHANGED
@@ -5,6 +5,7 @@ require "json"
5
5
  module Mammoth
6
6
  # Small command dispatcher for Mammoth's operator-facing CLI.
7
7
  class CLI
8
+ # Human-readable command usage printed for invalid or incomplete invocations.
8
9
  USAGE = [
9
10
  "Usage:",
10
11
  " mammoth version",
@@ -88,8 +89,9 @@ module Mammoth
88
89
  end
89
90
 
90
91
  def start
91
- processed = Application.new(load_config).start
92
- puts "Delivered events: #{processed}"
92
+ config = load_config
93
+ processed = Application.new(config, source: Sources::Postgres.new(config)).start
94
+ puts "Processed events: #{processed}"
93
95
  0
94
96
  end
95
97
 
@@ -101,7 +103,7 @@ module Mammoth
101
103
 
102
104
  event = JSON.parse(File.read(event_path))
103
105
  processed = Application.new(config, source: [event]).start
104
- puts "Delivered sample events: #{processed}"
106
+ puts "Processed sample events: #{processed}"
105
107
  0
106
108
  rescue JSON::ParserError => e
107
109
  raise ConfigurationError, "invalid event JSON in #{event_path}: #{e.message}"
data/lib/mammoth/configuration.rb CHANGED
@@ -10,7 +10,8 @@ module Mammoth
10
10
  # Configuration is intentionally schema-backed so the same contract can power
11
11
  # editor IntelliSense, preflight validation, and runtime startup checks.
12
12
  class Configuration
13
- DEFAULT_SCHEMA_PATH = File.expand_path("../../config/mammoth.schema.json", __dir__)
13
+ # Default JSON Schema used to validate Mammoth YAML configuration files.
14
+ DEFAULT_SCHEMA_PATH = File.expand_path("../../config/mammoth.schema.json", __dir__.to_s)
14
15
 
15
16
  attr_reader :path, :data, :schema_path
16
17
 
data/lib/mammoth/delivery_worker.rb CHANGED
@@ -8,6 +8,7 @@ module Mammoth
8
8
  # after success, and persist the failed event to the dead letter queue after
9
9
  # retry exhaustion.
10
10
  class DeliveryWorker
11
+ # Default source name used when an event does not provide one.
11
12
  DEFAULT_SOURCE = "postgresql"
12
13
 
13
14
  attr_reader :sink, :checkpoint_store, :dead_letter_store, :retry_schedule, :max_attempts, :sleeper, :source_name,
@@ -50,7 +51,7 @@ module Mammoth
50
51
  dead_letter_store: dead_letter_store,
51
52
  source_name: config.dig("mammoth", "name"),
52
53
  slot_name: config.dig("replication", "slot"),
53
- publication_name: config.dig("replication", "publication"),
54
+ publication_name: Array(config.dig("replication", "publications")).join(","),
54
55
  max_attempts: config.dig("retry", "max_attempts"),
55
56
  retry_schedule: config.dig("retry", "schedule_seconds"),
56
57
  sleeper: sleeper
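Reviewer note: the new `publication_name` expression collapses the `replication.publications` list into one comma-separated string. The sketch below shows how `Array(...)` and `join` behave for a list, a single string, and a missing key. It uses plain Hashes in place of `Mammoth::Configuration`, and the helper name `publication_label` is hypothetical.

```ruby
# Hypothetical helper reproducing the expression from the diff.
def publication_label(config)
  Array(config.dig("replication", "publications")).join(",")
end

puts publication_label({ "replication" => { "publications" => %w[orders users] } }) # orders,users
puts publication_label({ "replication" => { "publications" => "orders" } })         # orders
puts publication_label({ "replication" => {} }).inspect                             # ""
```

A missing key yields an empty string rather than an error, so a check elsewhere (such as the schema's `minItems: 1`) is what guards against an empty publication list.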
data/lib/mammoth/event_serializer.rb CHANGED
@@ -12,6 +12,7 @@ module Mammoth
12
12
  # from PostgreSQL-specific message shapes while preserving source metadata such
13
13
  # as commit LSN and transaction identity when available.
14
14
  class EventSerializer
15
+ # Default source label used in serialized webhook payloads.
15
16
  DEFAULT_SOURCE = "postgresql"
16
17
 
17
18
  # Serialize an event-like object into a webhook-ready Hash.
data/lib/mammoth/replication_consumer.rb CHANGED
@@ -1,35 +1,18 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Mammoth
4
- # Consumes CDC-core work items from Mammoth's configured replication source.
4
+ # Consumes normalized CDC work from an injected source.
5
5
  #
6
- # ReplicationConsumer is the boundary between upstream CDC ingestion and
7
- # sink delivery. Live PostgreSQL ingestion is delegated to {PgoutputSource};
8
- # injected sources remain available for unit tests, demos, and e2e fixtures.
6
+ # ReplicationConsumer is intentionally upstream-agnostic. It does not know
7
+ # which upstream system produced the work. Its
8
+ # only job is to consume CDC Ecosystem work, flatten CDC transaction
9
+ # envelopes, and yield individual change events to the delivery pipeline.
9
10
  class ReplicationConsumer
10
- attr_reader :config, :source, :adapter
11
+ attr_reader :source
11
12
 
12
- # @param config [Mammoth::Configuration] loaded configuration
13
13
  # @param source [#each, nil] injectable CDC work stream
14
- # @param adapter [#call, nil] optional adapter for injected raw events
15
- def initialize(config, source: nil, adapter: nil)
16
- @config = config
14
+ def initialize(source: nil)
17
15
  @source = source
18
- @adapter = adapter
19
- end
20
-
21
- # Return the configured replication slot.
22
- #
23
- # @return [String]
24
- def slot
25
- config.dig("replication", "slot")
26
- end
27
-
28
- # Return the configured publication.
29
- #
30
- # @return [String]
31
- def publication
32
- config.dig("replication", "publication")
33
16
  end
34
17
 
35
18
  # Consume normalized CDC work from the configured source.
@@ -40,36 +23,52 @@ module Mammoth
40
23
  return enum_for(:start) unless block_given?
41
24
 
42
25
  count = 0
26
+
43
27
  each_event do |event|
44
28
  yield event
45
29
  count += 1
46
30
  end
31
+
47
32
  count
48
33
  end
49
34
 
50
35
  private
51
36
 
52
- def each_event
53
- effective_source.each do |raw_work|
54
- normalize(raw_work).each { |event| yield event }
37
+ def each_event(&block)
38
+ effective_source.each do |work|
39
+ flatten_cdc_work(work).each(&block)
55
40
  end
56
41
  end
57
42
 
58
43
  def effective_source
59
- source || PgoutputSource.new(config)
44
+ source || raise(ReplicationError, "replication source is not configured")
45
+ end
46
+
47
+ def flatten_cdc_work(work)
48
+ return [] if work.nil?
49
+ return validate_events(work.events) if transaction_envelope?(work)
50
+ return work.flat_map { |item| flatten_cdc_work(item) } if work.is_a?(Array)
51
+
52
+ validate_events([work])
60
53
  end
61
54
 
62
- def normalize(raw_work)
63
- adapted = adapter ? adapter.call(raw_work) : raw_work
64
- flatten_cdc_work(adapted)
55
+ def validate_events(events)
56
+ events.each { |event| validate_cdc_event!(event) }
65
57
  end
66
58
 
67
- def flatten_cdc_work(work)
68
- if transaction_envelope?(work)
69
- work.events
70
- else
71
- Array(work).flat_map { |item| transaction_envelope?(item) ? item.events : item }
72
- end
59
+ def validate_cdc_event!(event)
60
+ return event if event.respond_to?(:to_h) && cdc_event_hash?(event.to_h)
61
+
62
+ raise ReplicationError, "CDC source yielded non-CDC work: #{event.class}"
63
+ end
64
+
65
+ def cdc_event_hash?(event_hash)
66
+ return false unless event_hash.respond_to?(:key?)
67
+
68
+ has_operation = event_hash.key?("operation") || event_hash.key?(:operation)
69
+ has_position = event_hash.key?("source_position") || event_hash.key?(:source_position) ||
70
+ event_hash.key?("commit_lsn") || event_hash.key?(:commit_lsn)
71
+ has_operation && has_position
73
72
  end
74
73
 
75
74
  def transaction_envelope?(work)
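Reviewer note: the rewritten consumer flattens nested work recursively and rejects anything that does not look like a CDC event. The following is a simplified, standalone sketch of that flow. The `Envelope` struct and the class check are stand-ins, since the body of `transaction_envelope?` is not shown in this diff, and `ReplicationError` is redefined locally so the example runs without Mammoth.

```ruby
# Local stand-ins; the real types come from Mammoth and cdc-core.
class ReplicationError < StandardError; end
Envelope = Struct.new(:events)

# An event qualifies when it exposes an operation and some position key.
def validate!(event)
  hash = event.respond_to?(:to_h) ? event.to_h : {}
  has_operation = hash.key?("operation") || hash.key?(:operation)
  has_position = %w[source_position commit_lsn].any? { |k| hash.key?(k) || hash.key?(k.to_sym) }
  return event if has_operation && has_position

  raise ReplicationError, "CDC source yielded non-CDC work: #{event.class}"
end

# nil becomes nothing, envelopes yield their events, arrays recurse.
def flatten_work(work)
  return [] if work.nil?
  return work.events.each { |event| validate!(event) } if work.is_a?(Envelope)
  return work.flat_map { |item| flatten_work(item) } if work.is_a?(Array)

  [validate!(work)]
end

insert = { "operation" => "insert", "commit_lsn" => "0/16B3748" }
update = { operation: "update", source_position: "0/16B3750" }

puts flatten_work([Envelope.new([insert]), nil, [update]]).length # 2
```

Raising on unrecognized work, instead of passing it through as the previous `Array(work)` version did, surfaces a misconfigured source at the consumer boundary rather than at the webhook sink.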
data/lib/mammoth/sources/postgres.rb ADDED
@@ -0,0 +1,203 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Mammoth
4
+ module Sources
5
+ # Concrete PostgreSQL CDC source for Mammoth.
6
+ #
7
+ # Postgres realizes the CDC Ecosystem libraries for Mammoth's product
8
+ # boundary. It composes pgoutput-client, pgoutput-parser,
9
+ # pgoutput-decoder, and pgoutput-source-adapter into a single source that
10
+ # yields CDC::Core-shaped work to the delivery runtime.
11
+ #
12
+ # This class may mention pgoutput implementation details because it is the
13
+ # concrete PostgreSQL source adapter used by Mammoth. The rest of Mammoth
14
+ # should remain source-agnostic and consume only the work yielded here.
15
+ class Postgres
16
+ # @return [Mammoth::Configuration] loaded Mammoth configuration
17
+ attr_reader :config
18
+ # @return [#start, nil] injected pgoutput-client runner
19
+ attr_reader :runner
20
+ # @return [Object, nil] injected pgoutput protocol parser
21
+ attr_reader :parser
22
+ # @return [Object, nil] injected pgoutput decoder
23
+ attr_reader :decoder
24
+ # @return [Object, nil] injected CDC source adapter
25
+ attr_reader :adapter
26
+
27
+ # Build a PostgreSQL CDC source.
28
+ #
29
+ # @param config [Mammoth::Configuration] loaded configuration
30
+ # @param runner [#start, nil] injected pgoutput-client runner
31
+ # @param parser [Object, nil] injected pgoutput parser or relation tracker
32
+ # @param decoder [Object, nil] injected pgoutput decoder
33
+ # @param adapter [Object, nil] injected source adapter
34
+ def initialize(config, runner: nil, parser: nil, decoder: nil, adapter: nil)
35
+ @config = config
36
+ @runner = runner
37
+ @parser = parser
38
+ @decoder = decoder
39
+ @adapter = adapter
40
+ end
41
+
42
+ # Stream CDC::Core-shaped work from PostgreSQL logical replication.
43
+ #
44
+ # Calling this method starts the injected or configured pgoutput-client
45
+ # runner. The runner owns the PostgreSQL replication connection and slot
46
+ # lifecycle; this class only composes the parser, decoder, and adapter
47
+ # libraries around the stream.
48
+ #
49
+ # @yieldparam work [Object] CDC::Core::ChangeEvent or TransactionEnvelope
50
+ # @return [Enumerator, nil]
51
+ # @raise [Mammoth::ReplicationError] when the source cannot stream CDC work
52
+ def each(&block)
53
+ return enum_for(:each) unless block_given?
54
+
55
+ effective_runner.start do |payload, metadata = nil|
56
+ process_payload(payload, metadata, &block)
57
+ end
58
+ nil
59
+ rescue StandardError => e
60
+ raise e if e.is_a?(ReplicationError)
61
+
62
+ raise ReplicationError, "PostgreSQL CDC source failed: #{e.message}"
63
+ end
64
+
65
+ private
66
+
67
+ def process_payload(payload, metadata, &block)
68
+ parsed = parse_payload(payload)
69
+ decoded = decode_message(parsed, metadata)
70
+ normalize_decoded(decoded).each { |work| block.call(work) if work }
71
+ end
72
+
73
+ def parse_payload(payload)
74
+ parser = effective_parser
75
+ return parser.process(payload) if parser.respond_to?(:process)
76
+ return parser.parse(payload) if parser.respond_to?(:parse)
77
+ return parser.call(payload) if parser.respond_to?(:call)
78
+
79
+ raise ReplicationError, "pgoutput parser must respond to #process, #parse, or #call"
80
+ end
81
+
82
+ def decode_message(message, metadata)
83
+ decoder = effective_decoder
84
+ if decoder.respond_to?(:decode)
85
+ return callable_accepts_metadata?(decoder, :decode) ? decoder.decode(message, metadata) : decoder.decode(message)
86
+ end
87
+ if decoder.respond_to?(:call)
88
+ return callable_accepts_metadata?(decoder, :call) ? decoder.call(message, metadata) : decoder.call(message)
89
+ end
90
+
91
+ raise ReplicationError, "pgoutput decoder must respond to #decode or #call"
92
+ end
93
+
94
+ def normalize_decoded(decoded)
95
+ return [] if decoded.nil?
96
+ return decoded.flat_map { |item| normalize_decoded(item) } if decoded.is_a?(Array)
97
+
98
+ adapter = effective_adapter
99
+ result = if adapter.respond_to?(:normalize)
100
+ adapter.normalize(decoded)
101
+ elsif adapter.respond_to?(:call)
102
+ adapter.call(decoded)
103
+ else
104
+ raise ReplicationError, "pgoutput source adapter must respond to #normalize or #call"
105
+ end
106
+
107
+ result.is_a?(Array) ? result.compact : [result].compact
108
+ end
109
+
110
+ def effective_runner
111
+ runner || (@effective_runner ||= build_runner)
112
+ end
113
+
114
+ def effective_parser
115
+ parser || (@effective_parser ||= build_parser)
116
+ end
117
+
118
+ def effective_decoder
119
+ decoder || (@effective_decoder ||= build_decoder)
120
+ end
121
+
122
+ def effective_adapter
123
+ adapter || (@effective_adapter ||= build_adapter)
124
+ end
125
+
126
+ def build_runner
127
+ require_optional!("pgoutput/client", "pgoutput-client")
128
+
129
+ Pgoutput::Client::Runner.new(**runner_options)
130
+ end
131
+
132
+ def build_parser
133
+ require_optional!("pgoutput", "pgoutput-parser")
134
+
135
+ Pgoutput::RelationTracker.new
136
+ end
137
+
138
+ def build_decoder
139
+ require_optional!("pgoutput/decoder", "pgoutput-decoder")
140
+
141
+ Pgoutput::Decoder.new
142
+ end
143
+
144
+ def build_adapter
145
+ require_optional!("pgoutput/source_adapter", "pgoutput-source-adapter")
146
+
147
+ Pgoutput::SourceAdapter::Cdc.new
148
+ end
149
+
150
+ def runner_options
151
+ {
152
+ database_url: database_url,
153
+ slot_name: required_config("replication", "slot"),
154
+ publication_names: required_publications,
155
+ start_lsn: config.dig("replication", "start_lsn"),
156
+ auto_create_slot: !config.dig("replication", "auto_create_slot").nil?,
157
+ temporary_slot: !config.dig("replication", "temporary_slot").nil?
158
+ }.tap do |options|
159
+ feedback_interval = config.dig("replication", "feedback_interval")
160
+ options[:feedback_interval] = feedback_interval unless feedback_interval.nil?
161
+ end
162
+ end
163
+
164
+ def required_publications
165
+ publications = required_config("replication", "publications")
166
+ unless publications.is_a?(Array) && publications.any? &&
167
+ publications.all? { |publication| publication.is_a?(String) && !publication.empty? }
168
+ raise ReplicationError, "missing PostgreSQL source config: replication.publications"
169
+ end
170
+
171
+ publications
172
+ end
173
+
174
+ def callable_accepts_metadata?(object, method_name)
175
+ arity = object.respond_to?(:arity) && method_name == :call ? object.arity : object.method(method_name).arity
176
+ arity != 1
177
+ end
178
+
179
+ def database_url
180
+ password = ENV.fetch(required_config("postgres", "password_env"), nil)
181
+ credentials = required_config("postgres", "username").dup
182
+ credentials << ":#{password}" if password
183
+
184
+ "postgres://#{credentials}@#{required_config("postgres", "host")}:#{required_config("postgres", "port")}/" \
185
+ "#{required_config("postgres", "database")}"
186
+ end
187
+
188
+ def required_config(*keys)
189
+ value = config.dig(*keys)
190
+ raise ReplicationError, "missing PostgreSQL source config: #{keys.join(".")}" if value.nil? || value == ""
191
+
192
+ value
193
+ end
194
+
195
+ def require_optional!(feature, gem_name)
196
+ require feature
197
+ true
198
+ rescue LoadError => e
199
+ raise ReplicationError, "#{gem_name} is required for PostgreSQL CDC source integration: #{e.message}"
200
+ end
201
+ end
202
+ end
203
+ end
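Reviewer note on `runner_options`: the expression `!value.nil?` reports whether a key is present, not whether it is true. If the intent is to forward the configured boolean (an assumption, since the diff does not state the intent), then the example configuration's explicit `auto_create_slot: false` would still enable the option. The sketch below uses a plain Hash in place of `Mammoth::Configuration` to show the difference.

```ruby
# Plain Hash standing in for the loaded configuration.
config = { "replication" => { "auto_create_slot" => false } }

# Expression as written in the diff: true whenever the key is set at all.
as_written = !config.dig("replication", "auto_create_slot").nil?

# One possible alternative (an assumption, not the author's code):
# only an explicit `true` enables the option.
strict = config.dig("replication", "auto_create_slot") == true

# A key that is absent from the file evaluates to false either way.
missing = !config.dig("replication", "temporary_slot").nil?

puts "as written: #{as_written}" # as written: true
puts "strict:     #{strict}"     # strict:     false
puts "missing:    #{missing}"    # missing:    false
```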
data/lib/mammoth/sqlite_store.rb CHANGED
@@ -11,10 +11,15 @@ module Mammoth
11
11
  # inspectable state required for reliability: schema versions, checkpoints,
12
12
  # and dead letters.
13
13
  class SQLiteStore
14
- DEFAULT_DB_PATH = File.expand_path("../../.sqlite3/mammoth.db", __dir__)
15
- MIGRATION_DIR = File.expand_path("sql", __dir__)
14
+ # Default SQLite database path used by local Mammoth runs.
15
+ DEFAULT_DB_PATH = File.expand_path("../../.sqlite3/mammoth.db", __dir__.to_s)
16
+ # Directory containing bundled SQLite schema migration files.
17
+ MIGRATION_DIR = File.expand_path("sql", __dir__.to_s)
18
+ # Initial schema migration file applied to new SQLite stores.
16
19
  BOOTSTRAP_FILE = "__bootstrap__.sql"
20
+ # Synthetic schema version recorded after the bootstrap migration succeeds.
17
21
  BOOTSTRAP_VERSION = "__bootstrap__"
22
+ # Table that records applied SQLite schema migrations.
18
23
  MIGRATIONS_TABLE = "schema_migrations"
19
24
 
20
25
  attr_reader :path
data/lib/mammoth/status.rb CHANGED
@@ -26,6 +26,8 @@ module Mammoth
26
26
  # @return [void]
27
27
  def call
28
28
  puts "Mammoth: #{config.dig("mammoth", "name")}"
29
+ puts "Replication slot: #{config.dig("replication", "slot")}"
30
+ puts "Replication publications: #{Array(config.dig("replication", "publications")).join(", ")}"
29
31
  puts "Runtime: not started"
30
32
  puts "SQLite: #{sqlite_path}"
31
33
  puts "Webhook: #{config.dig("webhook", "name")}"
data/lib/mammoth/version.rb CHANGED
@@ -1,5 +1,6 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Mammoth
4
- VERSION = "0.0.0"
4
+ # Current Mammoth gem version.
5
+ VERSION = "0.1.0"
5
6
  end
data/lib/mammoth/webhook_sink.rb CHANGED
@@ -2,11 +2,13 @@
2
2
 
3
3
  require "json"
4
4
  require "net/http"
5
+ require "socket"
5
6
  require "uri"
6
7
 
7
8
  module Mammoth
8
9
  # Delivers normalized Mammoth events to a webhook endpoint.
9
10
  class WebhookSink
11
+ # HTTP status range treated as successful webhook delivery.
10
12
  SUCCESS_RANGE = 200..299
11
13
 
12
14
  attr_reader :name, :url, :timeout_seconds
data/lib/mammoth.rb CHANGED
@@ -10,7 +10,8 @@ require_relative "mammoth/dead_letter_store"
10
10
  require_relative "mammoth/event_serializer"
11
11
  require_relative "mammoth/webhook_sink"
12
12
  require_relative "mammoth/delivery_worker"
13
- require_relative "mammoth/pgoutput_source"
13
+ require_relative "mammoth/sources/postgres"
14
+ require_relative "mammoth/cdc_source"
14
15
  require_relative "mammoth/replication_consumer"
15
16
  require_relative "mammoth/application"
16
17
  require_relative "mammoth/cli"
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: mammoth
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.0
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ken C. Demanawa
@@ -10,63 +10,63 @@ cert_chain: []
10
10
  date: 1980-01-02 00:00:00.000000000 Z
11
11
  dependencies:
12
12
  - !ruby/object:Gem::Dependency
13
- name: json-schema
13
+ name: cdc-core
14
14
  requirement: !ruby/object:Gem::Requirement
15
15
  requirements:
16
16
  - - "~>"
17
17
  - !ruby/object:Gem::Version
18
- version: '6.2'
18
+ version: '0.1'
19
19
  type: :runtime
20
20
  prerelease: false
21
21
  version_requirements: !ruby/object:Gem::Requirement
22
22
  requirements:
23
23
  - - "~>"
24
24
  - !ruby/object:Gem::Version
25
- version: '6.2'
25
+ version: '0.1'
26
26
  - !ruby/object:Gem::Dependency
27
- name: sqlite3
27
+ name: json-schema
28
28
  requirement: !ruby/object:Gem::Requirement
29
29
  requirements:
30
30
  - - "~>"
31
31
  - !ruby/object:Gem::Version
32
- version: '2.9'
32
+ version: '6.2'
33
33
  type: :runtime
34
34
  prerelease: false
35
35
  version_requirements: !ruby/object:Gem::Requirement
36
36
  requirements:
37
37
  - - "~>"
38
38
  - !ruby/object:Gem::Version
39
- version: '2.9'
39
+ version: '6.2'
40
40
  - !ruby/object:Gem::Dependency
41
- name: cdc-core
41
+ name: pgoutput-client
42
42
  requirement: !ruby/object:Gem::Requirement
43
43
  requirements:
44
44
  - - "~>"
45
45
  - !ruby/object:Gem::Version
46
- version: '0.1'
46
+ version: '0.2'
47
47
  type: :runtime
48
48
  prerelease: false
49
49
  version_requirements: !ruby/object:Gem::Requirement
50
50
  requirements:
51
51
  - - "~>"
52
52
  - !ruby/object:Gem::Version
53
- version: '0.1'
53
+ version: '0.2'
54
54
  - !ruby/object:Gem::Dependency
55
- name: pgoutput-client
55
+ name: pgoutput-decoder
56
56
  requirement: !ruby/object:Gem::Requirement
57
57
  requirements:
58
58
  - - "~>"
59
59
  - !ruby/object:Gem::Version
60
- version: '0.2'
60
+ version: '0.1'
61
61
  type: :runtime
62
62
  prerelease: false
63
63
  version_requirements: !ruby/object:Gem::Requirement
64
64
  requirements:
65
65
  - - "~>"
66
66
  - !ruby/object:Gem::Version
67
- version: '0.2'
67
+ version: '0.1'
68
68
  - !ruby/object:Gem::Dependency
69
- name: pgoutput-decoder
69
+ name: pgoutput-parser
70
70
  requirement: !ruby/object:Gem::Requirement
71
71
  requirements:
72
72
  - - "~>"
@@ -80,7 +80,7 @@ dependencies:
80
80
  - !ruby/object:Gem::Version
81
81
  version: '0.1'
82
82
  - !ruby/object:Gem::Dependency
83
- name: pgoutput-parser
83
+ name: pgoutput-source-adapter
84
84
  requirement: !ruby/object:Gem::Requirement
85
85
  requirements:
86
86
  - - "~>"
@@ -94,39 +94,44 @@ dependencies:
  - !ruby/object:Gem::Version
  version: '0.1'
  - !ruby/object:Gem::Dependency
- name: pgoutput-source-adapter
+ name: sqlite3
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.1'
+ version: '2.9'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - "~>"
  - !ruby/object:Gem::Version
- version: '0.1'
+ version: '2.9'
  description: |
  Mammoth is an OSS PostgreSQL change-event delivery appliance for Ruby.
 
- It consumes CDC-shaped events produced from the pgoutput family of gems and
- cdc-core, then delivers those changes to webhook endpoints with durable
+ It realizes the CDC Ecosystem pgoutput and cdc-core libraries for PostgreSQL,
+ then delivers normalized changes to webhook endpoints with durable
  checkpointing, retry state, dead letters, and operational visibility.
 
  Mammoth is application-first: it can be installed as a Ruby gem, packaged
  into a container image, or deployed into Kubernetes with Helm.
  email:
  - kenneth.c.demanawa@gmail.com
- executables: []
+ executables:
+ - mammoth
  extensions: []
  extra_rdoc_files: []
  files:
  - CHANGELOG.md
  - LICENSE.txt
  - README.md
+ - config/mammoth.example.yml
+ - config/mammoth.schema.json
+ - exe/mammoth
  - lib/mammoth.rb
  - lib/mammoth/application.rb
+ - lib/mammoth/cdc_source.rb
  - lib/mammoth/checkpoint_store.rb
  - lib/mammoth/cli.rb
  - lib/mammoth/configuration.rb
@@ -134,8 +139,8 @@ files:
  - lib/mammoth/delivery_worker.rb
  - lib/mammoth/errors.rb
  - lib/mammoth/event_serializer.rb
- - lib/mammoth/pgoutput_source.rb
  - lib/mammoth/replication_consumer.rb
+ - lib/mammoth/sources/postgres.rb
  - lib/mammoth/sql/__bootstrap__.sql
  - lib/mammoth/sqlite_store.rb
  - lib/mammoth/status.rb
@@ -1,166 +0,0 @@
- # frozen_string_literal: true
-
- module Mammoth
-   # Streams PostgreSQL logical replication through the CDC Ecosystem boundary.
-   #
-   # PgoutputSource is Mammoth's upstream integration point. It composes the
-   # standalone pgoutput transport, parser, decoder, and source-adapter gems so
-   # the rest of Mammoth only receives CDC-core domain objects. Transport
-   # resiliency remains owned by pgoutput-client; Mammoth owns delivery.
-   class PgoutputSource
-     # @return [Mammoth::Configuration] loaded Mammoth configuration
-     attr_reader :config
-     # @return [Object, nil] pgoutput-client compatible runner
-     attr_reader :runner
-     # @return [Object, nil] pgoutput-parser compatible parser
-     attr_reader :parser
-     # @return [Object, nil] pgoutput-decoder compatible decoder
-     attr_reader :decoder
-     # @return [Object, nil] CDC source adapter
-     attr_reader :source_adapter
-
-     # Build the pgoutput integration source.
-     #
-     # @param config [Mammoth::Configuration] loaded configuration
-     # @param runner [Object, nil] injectable pgoutput-client runner
-     # @param parser [Object, nil] injectable pgoutput parser
-     # @param decoder [Object, nil] injectable pgoutput decoder
-     # @param source_adapter [Object, nil] injectable CDC source adapter
-     def initialize(config, runner: nil, parser: nil, decoder: nil, source_adapter: nil)
-       @config = config
-       @runner = runner
-       @parser = parser
-       @decoder = decoder
-       @source_adapter = source_adapter
-     end
-
-     # Stream CDC-core objects from PostgreSQL.
-     #
-     # @yieldparam work [Object] CDC::Core::ChangeEvent or TransactionEnvelope
-     # @return [void]
-     # @raise [Mammoth::ReplicationError] when required CDC components are unavailable
-     def each
-       return enum_for(:each) unless block_given?
-
-       effective_runner.start do |payload, metadata|
-         normalized_items(payload, metadata).each { |item| yield item }
-       end
-     end
-
-     private
-
-     def normalized_items(payload, metadata)
-       decoded = effective_decoder ? invoke_component(effective_decoder, parsed_payload(payload), metadata) : parsed_payload(payload)
-       normalized = invoke_source_adapter(decoded, metadata)
-       Array(normalized).flatten
-     end
-
-     def parsed_payload(payload)
-       return payload unless effective_parser
-
-       invoke_component(effective_parser, payload)
-     end
-
-     def invoke_source_adapter(decoded, metadata)
-       adapter = effective_source_adapter
-       if adapter.respond_to?(:normalize)
-         adapter.normalize(decoded)
-       elsif adapter.respond_to?(:call)
-         adapter.call(decoded, metadata)
-       else
-         raise ReplicationError, "pgoutput source adapter must respond to #normalize or #call"
-       end
-     end
-
-     def invoke_component(component, *args)
-       if component.respond_to?(:call)
-         component.call(*args)
-       elsif component.respond_to?(:parse)
-         component.parse(*args)
-       elsif component.respond_to?(:decode)
-         component.decode(*args)
-       else
-         raise ReplicationError, "#{component.class} must respond to #call, #parse, or #decode"
-       end
-     end
-
-     def effective_runner
-       @runner ||= build_runner
-     end
-
-     def effective_parser
-       @parser ||= build_parser
-     end
-
-     def effective_decoder
-       @decoder ||= build_decoder
-     end
-
-     def effective_source_adapter
-       @source_adapter ||= build_source_adapter
-     end
-
-     def build_runner
-       require_optional!("pgoutput_client", "pgoutput-client")
-       Pgoutput::Client::Runner.new(
-         database_url: database_url,
-         slot_name: config.dig("replication", "slot"),
-         publication_names: [config.dig("replication", "publication")],
-         start_lsn: config.dig("replication", "start_lsn"),
-         auto_create_slot: config.dig("replication", "auto_create_slot") || false
-       )
-     end
-
-     def build_parser
-       require_any!(["pgoutput_parser", "pgoutput/parser"], "pgoutput-parser")
-       constant_or_nil("Pgoutput::Parser") || constant_or_nil("Pgoutput::Parser::Parser")
-     end
-
-     def build_decoder
-       require_any!(["pgoutput_decoder", "pgoutput/decoder"], "pgoutput-decoder")
-       constant_or_nil("Pgoutput::Decoder") || constant_or_nil("Pgoutput::Decoder::ValueDecoder")
-     end
-
-     def build_source_adapter
-       require_optional!("cdc_core", "cdc-core")
-       require_any!(["pgoutput_source_adapter", "pgoutput/source_adapter/cdc"], "pgoutput-source-adapter")
-
-       adapter_class = constant_or_nil("Pgoutput::SourceAdapter::Cdc")
-       raise ReplicationError, "Pgoutput::SourceAdapter::Cdc is unavailable" unless adapter_class
-
-       adapter_class.new
-     end
-
-     def database_url
-       password = ENV.fetch(config.dig("postgres", "password_env"), "")
-       user = config.dig("postgres", "username")
-       host = config.dig("postgres", "host")
-       port = config.dig("postgres", "port")
-       database = config.dig("postgres", "database")
-       "postgres://#{user}:#{password}@#{host}:#{port}/#{database}"
-     end
-
-     def require_optional!(feature, gem_name)
-       require feature
-     rescue LoadError => e
-       raise ReplicationError, "#{gem_name} is required for live pgoutput replication: #{e.message}"
-     end
-
-     def require_any!(features, gem_name)
-       errors = []
-       features.each do |feature|
-         require feature
-         return true
-       rescue LoadError => e
-         errors << e.message
-       end
-       raise ReplicationError, "#{gem_name} is required for live pgoutput replication: #{errors.join("; ")}"
-     end
-
-     def constant_or_nil(name)
-       name.split("::").reduce(Object) { |scope, const_name| scope.const_get(const_name, false) }
-     rescue NameError
-       nil
-     end
-   end
- end