mammoth 0.0.0 → 0.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +68 -12
- data/README.md +3 -5
- data/config/mammoth.example.yml +39 -0
- data/config/mammoth.schema.json +179 -0
- data/exe/mammoth +6 -0
- data/lib/mammoth/application.rb +9 -7
- data/lib/mammoth/cdc_source.rb +11 -0
- data/lib/mammoth/cli.rb +5 -3
- data/lib/mammoth/configuration.rb +2 -1
- data/lib/mammoth/delivery_worker.rb +2 -1
- data/lib/mammoth/event_serializer.rb +1 -0
- data/lib/mammoth/replication_consumer.rb +36 -37
- data/lib/mammoth/sources/postgres.rb +203 -0
- data/lib/mammoth/sqlite_store.rb +7 -2
- data/lib/mammoth/status.rb +2 -0
- data/lib/mammoth/version.rb +2 -1
- data/lib/mammoth/webhook_sink.rb +2 -0
- data/lib/mammoth.rb +2 -1
- metadata +27 -22
- data/lib/mammoth/pgoutput_source.rb +0 -166
checksums.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
---
|
|
2
2
|
SHA256:
|
|
3
|
-
metadata.gz:
|
|
4
|
-
data.tar.gz:
|
|
3
|
+
metadata.gz: 137dbf7bc96183b45b2756dcb736884a66a45e920a884d35829f74518f14cef6
|
|
4
|
+
data.tar.gz: 0dd31f792456297c5ac2e6f16ef6842211a87de63824e2e6707c7bac00a68e59
|
|
5
5
|
SHA512:
|
|
6
|
-
metadata.gz:
|
|
7
|
-
data.tar.gz:
|
|
6
|
+
metadata.gz: ff73c21bfee44d9b5c9d2f02535a0fdd2dadc9b86b868b8219305889b54c3ef50c8818092d6d60c65f96b8fa39097104e64cc7cc694af646bdd1208be128d134
|
|
7
|
+
data.tar.gz: 991cb45e7f3cbeb18afde25f41d0a1f1c4c3479f920c0551ca33ab2b93b266ad48a9bf258e46c79c68164c1deb7c493c5b252f616563d67f98b9df4c49057a08
|
data/CHANGELOG.md
CHANGED
|
@@ -1,14 +1,70 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
-
##
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
-
|
|
13
|
-
-
|
|
14
|
-
-
|
|
3
|
+
## Unreleased
|
|
4
|
+
|
|
5
|
+
|
|
6
|
+
## [0.1.0] - 2026-06-17
|
|
7
|
+
|
|
8
|
+
Initial public release of Mammoth.
|
|
9
|
+
|
|
10
|
+
### Added
|
|
11
|
+
|
|
12
|
+
- PostgreSQL logical replication source integration via pgoutput.
|
|
13
|
+
- CDC event normalization through the CDC ecosystem.
|
|
14
|
+
- Webhook delivery destination.
|
|
15
|
+
- SQLite-backed operational state storage.
|
|
16
|
+
- Checkpoint persistence infrastructure.
|
|
17
|
+
- Dead-letter persistence infrastructure.
|
|
18
|
+
- Retry handling for failed webhook deliveries.
|
|
19
|
+
- Configuration validation command.
|
|
20
|
+
- Operational state bootstrap command.
|
|
21
|
+
- Operational status command.
|
|
22
|
+
- Sample event delivery command.
|
|
23
|
+
- Helm chart for Kubernetes deployments.
|
|
24
|
+
- Persistent volume support for SQLite operational state.
|
|
25
|
+
- Example configurations and runnable demonstrations.
|
|
26
|
+
|
|
27
|
+
### Examples
|
|
28
|
+
|
|
29
|
+
- `examples/postgres_webhook`
|
|
30
|
+
|
|
31
|
+
- Demonstrates webhook delivery using sample CDC-shaped events.
|
|
32
|
+
|
|
33
|
+
- `examples/live_postgres_webhook`
|
|
34
|
+
|
|
35
|
+
- Demonstrates end-to-end PostgreSQL logical replication.
|
|
36
|
+
- Demonstrates replication slot management.
|
|
37
|
+
- Demonstrates webhook delivery from live database changes.
|
|
38
|
+
|
|
39
|
+
- `examples/operational_state`
|
|
40
|
+
|
|
41
|
+
- Demonstrates operational state bootstrap.
|
|
42
|
+
- Demonstrates checkpoint and dead-letter schema initialization.
|
|
43
|
+
|
|
44
|
+
- `examples/failing_webhook_retry`
|
|
45
|
+
|
|
46
|
+
- Demonstrates retry exhaustion behavior.
|
|
47
|
+
- Demonstrates durable dead-letter persistence.
|
|
48
|
+
|
|
49
|
+
- `examples/kubernetes_helm`
|
|
50
|
+
|
|
51
|
+
- Demonstrates Helm-based deployment.
|
|
52
|
+
- Demonstrates PVC-backed operational memory.
|
|
53
|
+
|
|
54
|
+
### Validation
|
|
55
|
+
|
|
56
|
+
The 0.1.0 release was manually validated through:
|
|
57
|
+
|
|
58
|
+
- PostgreSQL logical replication slot creation and consumption.
|
|
59
|
+
- Idle replication connections exceeding one hour.
|
|
60
|
+
- Post-idle event delivery.
|
|
61
|
+
- Webhook delivery success path.
|
|
62
|
+
- Webhook failure and dead-letter path.
|
|
63
|
+
- SQLite checkpoint and dead-letter persistence.
|
|
64
|
+
- Helm chart rendering and installation.
|
|
65
|
+
- Kubernetes deployment on Kind.
|
|
66
|
+
- PVC-backed operational state storage.
|
|
67
|
+
|
|
68
|
+
### Notes
|
|
69
|
+
|
|
70
|
+
Mammoth currently operates as a single active consumer per PostgreSQL logical replication slot. The default Helm deployment uses a single replica to align with PostgreSQL logical replication semantics.
|
data/README.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
[](https://badge.fury.io/rb/mammoth)
|
|
4
4
|
[](https://github.com/kanutocd/mammoth/actions)
|
|
5
|
-
[](https://www.ruby-lang.org/en/)
|
|
6
6
|
[](https://opensource.org/licenses/MIT)
|
|
7
7
|
|
|
8
8
|
Mammoth is a self-hosted PostgreSQL event relay focused on reliable delivery
|
|
@@ -11,9 +11,7 @@ of database change events.
|
|
|
11
11
|
```text
|
|
12
12
|
PostgreSQL
|
|
13
13
|
↓
|
|
14
|
-
|
|
15
|
-
↓
|
|
16
|
-
Pgoutput::SourceAdapter::Cdc
|
|
14
|
+
CDC Ecosystem source adapter
|
|
17
15
|
↓
|
|
18
16
|
CDC::Core::ChangeEvent
|
|
19
17
|
↓
|
|
@@ -39,7 +37,7 @@ Mammoth OSS includes:
|
|
|
39
37
|
- webhook delivery sink
|
|
40
38
|
- delivery worker with retry, checkpoint, and DLQ handling
|
|
41
39
|
- CDC-core event serialization boundary
|
|
42
|
-
-
|
|
40
|
+
- CDC Ecosystem source-adapter integration boundary
|
|
43
41
|
- Docker image support
|
|
44
42
|
- public Helm chart support
|
|
45
43
|
- unit and e2e test tasks
|
data/config/mammoth.example.yml
ADDED
|
@@ -0,0 +1,39 @@
|
|
|
1
|
+
# yaml-language-server: $schema=./mammoth.schema.json
|
|
2
|
+
|
|
3
|
+
mammoth:
|
|
4
|
+
name: local_mammoth
|
|
5
|
+
|
|
6
|
+
postgres:
|
|
7
|
+
host: localhost
|
|
8
|
+
port: 5432
|
|
9
|
+
database: app_development
|
|
10
|
+
username: mammoth
|
|
11
|
+
password_env: MAMMOTH_POSTGRES_PASSWORD
|
|
12
|
+
|
|
13
|
+
replication:
|
|
14
|
+
slot: mammoth_prod
|
|
15
|
+
publications:
|
|
16
|
+
- mammoth_publication
|
|
17
|
+
auto_create_slot: false
|
|
18
|
+
temporary_slot: false
|
|
19
|
+
feedback_interval: 10.0
|
|
20
|
+
|
|
21
|
+
webhook:
|
|
22
|
+
name: primary_webhook
|
|
23
|
+
url: https://example.com/webhooks/postgres
|
|
24
|
+
timeout_seconds: 5
|
|
25
|
+
|
|
26
|
+
retry:
|
|
27
|
+
max_attempts: 5
|
|
28
|
+
schedule_seconds:
|
|
29
|
+
- 1
|
|
30
|
+
- 5
|
|
31
|
+
- 30
|
|
32
|
+
- 60
|
|
33
|
+
- 300
|
|
34
|
+
|
|
35
|
+
sqlite:
|
|
36
|
+
path: data/mammoth.db
|
|
37
|
+
|
|
38
|
+
logging:
|
|
39
|
+
level: info
|
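The example configuration above can be read with Ruby's standard YAML library. The sketch below is illustrative only and is not Mammoth's `Configuration` class: it parses a trimmed copy of the file and shows how a `password_env` indirection keeps the secret itself out of the YAML. The `resolve_password` helper and the `EXAMPLE_CONFIG` constant are hypothetical names introduced for this example.

```ruby
require "yaml"

# A trimmed copy of the example configuration shown in the diff above.
EXAMPLE_CONFIG = <<~YAML
  postgres:
    host: localhost
    port: 5432
    password_env: MAMMOTH_POSTGRES_PASSWORD
  replication:
    slot: mammoth_prod
    publications:
      - mammoth_publication
  retry:
    max_attempts: 5
    schedule_seconds: [1, 5, 30, 60, 300]
YAML

# Hypothetical helper (not part of Mammoth): look up the password in the
# environment variable named by postgres.password_env.
# Returns nil when that variable is not set.
def resolve_password(config, env = ENV)
  env[config.dig("postgres", "password_env")]
end

config = YAML.safe_load(EXAMPLE_CONFIG)

# Nested values are read with Hash#dig, the same access style the diff uses.
slot = config.dig("replication", "slot")
publications = config.dig("replication", "publications")
```

Passing the environment as an argument (defaulting to `ENV`) makes the lookup easy to exercise with a plain Hash in tests.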
data/config/mammoth.schema.json
ADDED
|
@@ -0,0 +1,179 @@
|
|
|
1
|
+
{
|
|
2
|
+
"$schema": "http://json-schema.org/draft-04/schema#",
|
|
3
|
+
"id": "https://kanutocd.github.io/mammoth/schema/mammoth.schema.json",
|
|
4
|
+
"title": "Mammoth Configuration",
|
|
5
|
+
"type": "object",
|
|
6
|
+
"additionalProperties": false,
|
|
7
|
+
|
|
8
|
+
"required": [
|
|
9
|
+
"mammoth",
|
|
10
|
+
"postgres",
|
|
11
|
+
"replication",
|
|
12
|
+
"webhook",
|
|
13
|
+
"retry",
|
|
14
|
+
"sqlite",
|
|
15
|
+
"logging"
|
|
16
|
+
],
|
|
17
|
+
|
|
18
|
+
"properties": {
|
|
19
|
+
"mammoth": {
|
|
20
|
+
"type": "object",
|
|
21
|
+
"additionalProperties": false,
|
|
22
|
+
"required": ["name"],
|
|
23
|
+
"properties": {
|
|
24
|
+
"name": {
|
|
25
|
+
"type": "string",
|
|
26
|
+
"minLength": 1
|
|
27
|
+
}
|
|
28
|
+
}
|
|
29
|
+
},
|
|
30
|
+
|
|
31
|
+
"postgres": {
|
|
32
|
+
"type": "object",
|
|
33
|
+
"additionalProperties": false,
|
|
34
|
+
"required": [
|
|
35
|
+
"host",
|
|
36
|
+
"port",
|
|
37
|
+
"database",
|
|
38
|
+
"username",
|
|
39
|
+
"password_env"
|
|
40
|
+
],
|
|
41
|
+
"properties": {
|
|
42
|
+
"host": {
|
|
43
|
+
"type": "string",
|
|
44
|
+
"minLength": 1
|
|
45
|
+
},
|
|
46
|
+
"port": {
|
|
47
|
+
"type": "integer",
|
|
48
|
+
"minimum": 1,
|
|
49
|
+
"maximum": 65535
|
|
50
|
+
},
|
|
51
|
+
"database": {
|
|
52
|
+
"type": "string",
|
|
53
|
+
"minLength": 1
|
|
54
|
+
},
|
|
55
|
+
"username": {
|
|
56
|
+
"type": "string",
|
|
57
|
+
"minLength": 1
|
|
58
|
+
},
|
|
59
|
+
"password_env": {
|
|
60
|
+
"type": "string",
|
|
61
|
+
"minLength": 1
|
|
62
|
+
}
|
|
63
|
+
}
|
|
64
|
+
},
|
|
65
|
+
|
|
66
|
+
"replication": {
|
|
67
|
+
"type": "object",
|
|
68
|
+
"additionalProperties": false,
|
|
69
|
+
"required": [
|
|
70
|
+
"slot",
|
|
71
|
+
"publications"
|
|
72
|
+
],
|
|
73
|
+
"properties": {
|
|
74
|
+
"slot": {
|
|
75
|
+
"type": "string",
|
|
76
|
+
"minLength": 1
|
|
77
|
+
},
|
|
78
|
+
"publications": {
|
|
79
|
+
"type": "array",
|
|
80
|
+
"minItems": 1,
|
|
81
|
+
"items": {
|
|
82
|
+
"type": "string",
|
|
83
|
+
"minLength": 1
|
|
84
|
+
}
|
|
85
|
+
},
|
|
86
|
+
"start_lsn": {
|
|
87
|
+
"type": ["string", "null"]
|
|
88
|
+
},
|
|
89
|
+
"auto_create_slot": {
|
|
90
|
+
"type": "boolean"
|
|
91
|
+
},
|
|
92
|
+
"temporary_slot": {
|
|
93
|
+
"type": "boolean"
|
|
94
|
+
},
|
|
95
|
+
"feedback_interval": {
|
|
96
|
+
"type": "number",
|
|
97
|
+
"minimum": 0,
|
|
98
|
+
"exclusiveMinimum": true
|
|
99
|
+
}
|
|
100
|
+
}
|
|
101
|
+
},
|
|
102
|
+
|
|
103
|
+
"webhook": {
|
|
104
|
+
"type": "object",
|
|
105
|
+
"additionalProperties": false,
|
|
106
|
+
"required": [
|
|
107
|
+
"name",
|
|
108
|
+
"url",
|
|
109
|
+
"timeout_seconds"
|
|
110
|
+
],
|
|
111
|
+
"properties": {
|
|
112
|
+
"name": {
|
|
113
|
+
"type": "string",
|
|
114
|
+
"minLength": 1
|
|
115
|
+
},
|
|
116
|
+
"url": {
|
|
117
|
+
"type": "string",
|
|
118
|
+
"format": "uri"
|
|
119
|
+
},
|
|
120
|
+
"timeout_seconds": {
|
|
121
|
+
"type": "integer",
|
|
122
|
+
"minimum": 1
|
|
123
|
+
}
|
|
124
|
+
}
|
|
125
|
+
},
|
|
126
|
+
|
|
127
|
+
"retry": {
|
|
128
|
+
"type": "object",
|
|
129
|
+
"additionalProperties": false,
|
|
130
|
+
"required": [
|
|
131
|
+
"max_attempts",
|
|
132
|
+
"schedule_seconds"
|
|
133
|
+
],
|
|
134
|
+
"properties": {
|
|
135
|
+
"max_attempts": {
|
|
136
|
+
"type": "integer",
|
|
137
|
+
"minimum": 1
|
|
138
|
+
},
|
|
139
|
+
"schedule_seconds": {
|
|
140
|
+
"type": "array",
|
|
141
|
+
"minItems": 1,
|
|
142
|
+
"items": {
|
|
143
|
+
"type": "integer",
|
|
144
|
+
"minimum": 1
|
|
145
|
+
}
|
|
146
|
+
}
|
|
147
|
+
}
|
|
148
|
+
},
|
|
149
|
+
|
|
150
|
+
"sqlite": {
|
|
151
|
+
"type": "object",
|
|
152
|
+
"additionalProperties": false,
|
|
153
|
+
"required": ["path"],
|
|
154
|
+
"properties": {
|
|
155
|
+
"path": {
|
|
156
|
+
"type": "string",
|
|
157
|
+
"minLength": 1
|
|
158
|
+
}
|
|
159
|
+
}
|
|
160
|
+
},
|
|
161
|
+
|
|
162
|
+
"logging": {
|
|
163
|
+
"type": "object",
|
|
164
|
+
"additionalProperties": false,
|
|
165
|
+
"required": ["level"],
|
|
166
|
+
"properties": {
|
|
167
|
+
"level": {
|
|
168
|
+
"type": "string",
|
|
169
|
+
"enum": [
|
|
170
|
+
"debug",
|
|
171
|
+
"info",
|
|
172
|
+
"warn",
|
|
173
|
+
"error"
|
|
174
|
+
]
|
|
175
|
+
}
|
|
176
|
+
}
|
|
177
|
+
}
|
|
178
|
+
}
|
|
179
|
+
}
|
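The schema above declares JSON Schema draft-04, where `exclusiveMinimum` is a boolean that modifies `minimum`. So `"minimum": 0` together with `"exclusiveMinimum": true` on `feedback_interval` means the value must be strictly greater than zero. The hand-written checks below only illustrate those two constraints in plain Ruby; they are an assumption-laden sketch, not how Mammoth validates configuration (the gem metadata lists `json-schema` as a dependency). Both method names are hypothetical.

```ruby
# Hypothetical check mirroring the schema's feedback_interval constraint:
# type "number", minimum 0, exclusiveMinimum true  =>  value > 0.
def valid_feedback_interval?(value)
  (value.is_a?(Integer) || value.is_a?(Float)) && value > 0
end

# Hypothetical check mirroring the schema's port constraint:
# type "integer", minimum 1, maximum 65535 (both bounds inclusive).
def valid_port?(value)
  value.is_a?(Integer) && (1..65_535).cover?(value)
end
```

Under these rules `10.0` is an acceptable `feedback_interval` while `0` is not, and `5432` is an acceptable port while `0` and `65536` are not.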
data/exe/mammoth
ADDED
data/lib/mammoth/application.rb
CHANGED
|
@@ -1,23 +1,25 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
3
|
module Mammoth
|
|
4
|
-
# Top-level Mammoth application
|
|
4
|
+
# Top-level Mammoth application runtime.
|
|
5
5
|
#
|
|
6
|
-
# Application wires Mammoth's
|
|
7
|
-
# SQLite operational memory, replication
|
|
8
|
-
# store, dead letter store, and webhook sink.
|
|
6
|
+
# Application wires Mammoth's delivery-side runtime pieces: configuration,
|
|
7
|
+
# SQLite operational memory, replication consumer, delivery worker, checkpoint
|
|
8
|
+
# store, dead letter store, and webhook sink. Upstream PostgreSQL transport
|
|
9
|
+
# composition stays outside this class so the application runtime consumes an
|
|
10
|
+
# injected CDC work source rather than owning upstream CDC source-adapter
|
|
11
|
+
# lifecycle decisions.
|
|
9
12
|
class Application
|
|
10
13
|
attr_reader :config, :sqlite_store, :consumer, :delivery_worker
|
|
11
14
|
|
|
12
15
|
# @param config [Mammoth::Configuration] loaded configuration
|
|
13
16
|
# @param source [#each, nil] injectable event source for tests and demos
|
|
14
|
-
# @param adapter [#call, nil] optional source adapter
|
|
15
17
|
# @param sink [#deliver, nil] optional destination sink
|
|
16
18
|
# @param sleeper [#call] retry sleep strategy
|
|
17
|
-
def initialize(config, source: nil,
|
|
19
|
+
def initialize(config, source: nil, sink: nil, sleeper: Kernel.method(:sleep))
|
|
18
20
|
@config = config
|
|
19
21
|
@sqlite_store = SQLiteStore.connect(config.dig("sqlite", "path")).bootstrap!
|
|
20
|
-
@consumer = ReplicationConsumer.new(
|
|
22
|
+
@consumer = ReplicationConsumer.new(source: source)
|
|
21
23
|
@delivery_worker = build_delivery_worker(sink: sink || WebhookSink.from_config(config), sleeper: sleeper)
|
|
22
24
|
end
|
|
23
25
|
|
data/lib/mammoth/cdc_source.rb
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Mammoth
|
|
4
|
+
# Backwards-compatible internal alias for the PostgreSQL CDC source.
|
|
5
|
+
#
|
|
6
|
+
# New code should use {Mammoth::Sources::Postgres}. Mammoth v0.1.0 keeps this
|
|
7
|
+
# constant so older tests or examples that referenced the transitional
|
|
8
|
+
# CdcSource name continue to work while the product-facing source name moves
|
|
9
|
+
# to PostgreSQL.
|
|
10
|
+
CdcSource = Sources::Postgres
|
|
11
|
+
end
|
data/lib/mammoth/cli.rb
CHANGED
|
@@ -5,6 +5,7 @@ require "json"
|
|
|
5
5
|
module Mammoth
|
|
6
6
|
# Small command dispatcher for Mammoth's operator-facing CLI.
|
|
7
7
|
class CLI
|
|
8
|
+
# Human-readable command usage printed for invalid or incomplete invocations.
|
|
8
9
|
USAGE = [
|
|
9
10
|
"Usage:",
|
|
10
11
|
" mammoth version",
|
|
@@ -88,8 +89,9 @@ module Mammoth
|
|
|
88
89
|
end
|
|
89
90
|
|
|
90
91
|
def start
|
|
91
|
-
|
|
92
|
-
|
|
92
|
+
config = load_config
|
|
93
|
+
processed = Application.new(config, source: Sources::Postgres.new(config)).start
|
|
94
|
+
puts "Processed events: #{processed}"
|
|
93
95
|
0
|
|
94
96
|
end
|
|
95
97
|
|
|
@@ -101,7 +103,7 @@ module Mammoth
|
|
|
101
103
|
|
|
102
104
|
event = JSON.parse(File.read(event_path))
|
|
103
105
|
processed = Application.new(config, source: [event]).start
|
|
104
|
-
puts "
|
|
106
|
+
puts "Processed sample events: #{processed}"
|
|
105
107
|
0
|
|
106
108
|
rescue JSON::ParserError => e
|
|
107
109
|
raise ConfigurationError, "invalid event JSON in #{event_path}: #{e.message}"
|
data/lib/mammoth/configuration.rb
CHANGED
|
@@ -10,7 +10,8 @@ module Mammoth
|
|
|
10
10
|
# Configuration is intentionally schema-backed so the same contract can power
|
|
11
11
|
# editor IntelliSense, preflight validation, and runtime startup checks.
|
|
12
12
|
class Configuration
|
|
13
|
-
|
|
13
|
+
# Default JSON Schema used to validate Mammoth YAML configuration files.
|
|
14
|
+
DEFAULT_SCHEMA_PATH = File.expand_path("../../config/mammoth.schema.json", __dir__.to_s)
|
|
14
15
|
|
|
15
16
|
attr_reader :path, :data, :schema_path
|
|
16
17
|
|
data/lib/mammoth/delivery_worker.rb
CHANGED
|
@@ -8,6 +8,7 @@ module Mammoth
|
|
|
8
8
|
# after success, and persist the failed event to the dead letter queue after
|
|
9
9
|
# retry exhaustion.
|
|
10
10
|
class DeliveryWorker
|
|
11
|
+
# Default source name used when an event does not provide one.
|
|
11
12
|
DEFAULT_SOURCE = "postgresql"
|
|
12
13
|
|
|
13
14
|
attr_reader :sink, :checkpoint_store, :dead_letter_store, :retry_schedule, :max_attempts, :sleeper, :source_name,
|
|
@@ -50,7 +51,7 @@ module Mammoth
|
|
|
50
51
|
dead_letter_store: dead_letter_store,
|
|
51
52
|
source_name: config.dig("mammoth", "name"),
|
|
52
53
|
slot_name: config.dig("replication", "slot"),
|
|
53
|
-
publication_name: config.dig("replication", "
|
|
54
|
+
publication_name: Array(config.dig("replication", "publications")).join(","),
|
|
54
55
|
max_attempts: config.dig("retry", "max_attempts"),
|
|
55
56
|
retry_schedule: config.dig("retry", "schedule_seconds"),
|
|
56
57
|
sleeper: sleeper
|
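The worker above is built from `max_attempts`, `retry_schedule`, and an injectable `sleeper`, and its comment says a failed event goes to the dead letter queue after retry exhaustion. The loop itself is not part of this diff, so the following is only a sketch of one way such a policy could work. It assumes that `max_attempts` counts total attempts, that the schedule gives the delay before each retry, and that the last delay is reused when the schedule is shorter than the number of retries. `deliver_with_retry` and its `dead_letters` array are hypothetical stand-ins, not Mammoth's API.

```ruby
# Hypothetical retry loop; NOT Mammoth's DeliveryWorker implementation.
# sink and sleeper are any objects responding to #call.
def deliver_with_retry(event, sink:, max_attempts:, retry_schedule:, sleeper:, dead_letters:)
  attempt = 0
  begin
    attempt += 1
    sink.call(event)
    :delivered
  rescue StandardError => e
    if attempt < max_attempts
      # Assumed policy: reuse the last delay once the schedule runs out.
      sleeper.call(retry_schedule[attempt - 1] || retry_schedule.last)
      retry
    end
    # Attempts exhausted: record the failure instead of raising.
    dead_letters << { event: event, error: e.message, attempts: attempt }
    :dead_lettered
  end
end
```

Injecting the sleeper, as the diff does with `Kernel.method(:sleep)`, lets tests record the requested delays instead of actually waiting.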
data/lib/mammoth/event_serializer.rb
CHANGED
|
@@ -12,6 +12,7 @@ module Mammoth
|
|
|
12
12
|
# from PostgreSQL-specific message shapes while preserving source metadata such
|
|
13
13
|
# as commit LSN and transaction identity when available.
|
|
14
14
|
class EventSerializer
|
|
15
|
+
# Default source label used in serialized webhook payloads.
|
|
15
16
|
DEFAULT_SOURCE = "postgresql"
|
|
16
17
|
|
|
17
18
|
# Serialize an event-like object into a webhook-ready Hash.
|
data/lib/mammoth/replication_consumer.rb
CHANGED
|
@@ -1,35 +1,18 @@
|
|
|
1
1
|
# frozen_string_literal: true
|
|
2
2
|
|
|
3
3
|
module Mammoth
|
|
4
|
-
# Consumes CDC
|
|
4
|
+
# Consumes normalized CDC work from an injected source.
|
|
5
5
|
#
|
|
6
|
-
# ReplicationConsumer is
|
|
7
|
-
#
|
|
8
|
-
#
|
|
6
|
+
# ReplicationConsumer is intentionally upstream-agnostic. It does not know
|
|
7
|
+
# which upstream system produced the work. Its
|
|
8
|
+
# only job is to consume CDC Ecosystem work, flatten CDC transaction
|
|
9
|
+
# envelopes, and yield individual change events to the delivery pipeline.
|
|
9
10
|
class ReplicationConsumer
|
|
10
|
-
attr_reader :
|
|
11
|
+
attr_reader :source
|
|
11
12
|
|
|
12
|
-
# @param config [Mammoth::Configuration] loaded configuration
|
|
13
13
|
# @param source [#each, nil] injectable CDC work stream
|
|
14
|
-
|
|
15
|
-
def initialize(config, source: nil, adapter: nil)
|
|
16
|
-
@config = config
|
|
14
|
+
def initialize(source: nil)
|
|
17
15
|
@source = source
|
|
18
|
-
@adapter = adapter
|
|
19
|
-
end
|
|
20
|
-
|
|
21
|
-
# Return the configured replication slot.
|
|
22
|
-
#
|
|
23
|
-
# @return [String]
|
|
24
|
-
def slot
|
|
25
|
-
config.dig("replication", "slot")
|
|
26
|
-
end
|
|
27
|
-
|
|
28
|
-
# Return the configured publication.
|
|
29
|
-
#
|
|
30
|
-
# @return [String]
|
|
31
|
-
def publication
|
|
32
|
-
config.dig("replication", "publication")
|
|
33
16
|
end
|
|
34
17
|
|
|
35
18
|
# Consume normalized CDC work from the configured source.
|
|
@@ -40,36 +23,52 @@ module Mammoth
|
|
|
40
23
|
return enum_for(:start) unless block_given?
|
|
41
24
|
|
|
42
25
|
count = 0
|
|
26
|
+
|
|
43
27
|
each_event do |event|
|
|
44
28
|
yield event
|
|
45
29
|
count += 1
|
|
46
30
|
end
|
|
31
|
+
|
|
47
32
|
count
|
|
48
33
|
end
|
|
49
34
|
|
|
50
35
|
private
|
|
51
36
|
|
|
52
|
-
def each_event
|
|
53
|
-
effective_source.each do |
|
|
54
|
-
|
|
37
|
+
def each_event(&block)
|
|
38
|
+
effective_source.each do |work|
|
|
39
|
+
flatten_cdc_work(work).each(&block)
|
|
55
40
|
end
|
|
56
41
|
end
|
|
57
42
|
|
|
58
43
|
def effective_source
|
|
59
|
-
source ||
|
|
44
|
+
source || raise(ReplicationError, "replication source is not configured")
|
|
45
|
+
end
|
|
46
|
+
|
|
47
|
+
def flatten_cdc_work(work)
|
|
48
|
+
return [] if work.nil?
|
|
49
|
+
return validate_events(work.events) if transaction_envelope?(work)
|
|
50
|
+
return work.flat_map { |item| flatten_cdc_work(item) } if work.is_a?(Array)
|
|
51
|
+
|
|
52
|
+
validate_events([work])
|
|
60
53
|
end
|
|
61
54
|
|
|
62
|
-
def
|
|
63
|
-
|
|
64
|
-
flatten_cdc_work(adapted)
|
|
55
|
+
def validate_events(events)
|
|
56
|
+
events.each { |event| validate_cdc_event!(event) }
|
|
65
57
|
end
|
|
66
58
|
|
|
67
|
-
def
|
|
68
|
-
if
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
59
|
+
def validate_cdc_event!(event)
|
|
60
|
+
return event if event.respond_to?(:to_h) && cdc_event_hash?(event.to_h)
|
|
61
|
+
|
|
62
|
+
raise ReplicationError, "CDC source yielded non-CDC work: #{event.class}"
|
|
63
|
+
end
|
|
64
|
+
|
|
65
|
+
def cdc_event_hash?(event_hash)
|
|
66
|
+
return false unless event_hash.respond_to?(:key?)
|
|
67
|
+
|
|
68
|
+
has_operation = event_hash.key?("operation") || event_hash.key?(:operation)
|
|
69
|
+
has_position = event_hash.key?("source_position") || event_hash.key?(:source_position) ||
|
|
70
|
+
event_hash.key?("commit_lsn") || event_hash.key?(:commit_lsn)
|
|
71
|
+
has_operation && has_position
|
|
73
72
|
end
|
|
74
73
|
|
|
75
74
|
def transaction_envelope?(work)
|
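The flattening step in the hunk above can be tried in isolation. The sketch below copies the shape of `flatten_cdc_work` but leaves out the event validation, and it assumes that a transaction envelope is anything responding to `#events`, because the body of `transaction_envelope?` is not shown in this diff. The `Envelope` struct is a hypothetical stand-in for the CDC library's envelope class.

```ruby
# Hypothetical stand-in for a CDC transaction envelope.
Envelope = Struct.new(:events)

# Assumption: an envelope is any object exposing #events.
def transaction_envelope?(work)
  work.respond_to?(:events)
end

# Turn one unit of work (nil, a single event, an envelope, or a nested
# array of those) into a flat array of individual events.
def flatten_cdc_work(work)
  return [] if work.nil?
  return work.events if transaction_envelope?(work)
  return work.flat_map { |item| flatten_cdc_work(item) } if work.is_a?(Array)

  [work]
end
```

With this shape the consumer can treat single events, envelopes, and batches uniformly, and `nil` work items simply contribute nothing.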
data/lib/mammoth/sources/postgres.rb
ADDED
|
@@ -0,0 +1,203 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
module Mammoth
|
|
4
|
+
module Sources
|
|
5
|
+
# Concrete PostgreSQL CDC source for Mammoth.
|
|
6
|
+
#
|
|
7
|
+
# Postgres realizes the CDC Ecosystem libraries for Mammoth's product
|
|
8
|
+
# boundary. It composes pgoutput-client, pgoutput-parser,
|
|
9
|
+
# pgoutput-decoder, and pgoutput-source-adapter into a single source that
|
|
10
|
+
# yields CDC::Core-shaped work to the delivery runtime.
|
|
11
|
+
#
|
|
12
|
+
# This class may mention pgoutput implementation details because it is the
|
|
13
|
+
# concrete PostgreSQL source adapter used by Mammoth. The rest of Mammoth
|
|
14
|
+
# should remain source-agnostic and consume only the work yielded here.
|
|
15
|
+
class Postgres
|
|
16
|
+
# @return [Mammoth::Configuration] loaded Mammoth configuration
|
|
17
|
+
attr_reader :config
|
|
18
|
+
# @return [#start, nil] injected pgoutput-client runner
|
|
19
|
+
attr_reader :runner
|
|
20
|
+
# @return [Object, nil] injected pgoutput protocol parser
|
|
21
|
+
attr_reader :parser
|
|
22
|
+
# @return [Object, nil] injected pgoutput decoder
|
|
23
|
+
attr_reader :decoder
|
|
24
|
+
# @return [Object, nil] injected CDC source adapter
|
|
25
|
+
attr_reader :adapter
|
|
26
|
+
|
|
27
|
+
# Build a PostgreSQL CDC source.
|
|
28
|
+
#
|
|
29
|
+
# @param config [Mammoth::Configuration] loaded configuration
|
|
30
|
+
# @param runner [#start, nil] injected pgoutput-client runner
|
|
31
|
+
# @param parser [Object, nil] injected pgoutput parser or relation tracker
|
|
32
|
+
# @param decoder [Object, nil] injected pgoutput decoder
|
|
33
|
+
# @param adapter [Object, nil] injected source adapter
|
|
34
|
+
def initialize(config, runner: nil, parser: nil, decoder: nil, adapter: nil)
|
|
35
|
+
@config = config
|
|
36
|
+
@runner = runner
|
|
37
|
+
@parser = parser
|
|
38
|
+
@decoder = decoder
|
|
39
|
+
@adapter = adapter
|
|
40
|
+
end
|
|
41
|
+
|
|
42
|
+
# Stream CDC::Core-shaped work from PostgreSQL logical replication.
|
|
43
|
+
#
|
|
44
|
+
# Calling this method starts the injected or configured pgoutput-client
|
|
45
|
+
# runner. The runner owns the PostgreSQL replication connection and slot
|
|
46
|
+
# lifecycle; this class only composes the parser, decoder, and adapter
|
|
47
|
+
# libraries around the stream.
|
|
48
|
+
#
|
|
49
|
+
# @yieldparam work [Object] CDC::Core::ChangeEvent or TransactionEnvelope
|
|
50
|
+
# @return [Enumerator, nil]
|
|
51
|
+
# @raise [Mammoth::ReplicationError] when the source cannot stream CDC work
|
|
52
|
+
def each(&block)
|
|
53
|
+
return enum_for(:each) unless block_given?
|
|
54
|
+
|
|
55
|
+
effective_runner.start do |payload, metadata = nil|
|
|
56
|
+
process_payload(payload, metadata, &block)
|
|
57
|
+
end
|
|
58
|
+
nil
|
|
59
|
+
rescue StandardError => e
|
|
60
|
+
raise e if e.is_a?(ReplicationError)
|
|
61
|
+
|
|
62
|
+
raise ReplicationError, "PostgreSQL CDC source failed: #{e.message}"
|
|
63
|
+
end
|
|
64
|
+
|
|
65
|
+
private
|
|
66
|
+
|
|
67
|
+
def process_payload(payload, metadata, &block)
|
|
68
|
+
parsed = parse_payload(payload)
|
|
69
|
+
decoded = decode_message(parsed, metadata)
|
|
70
|
+
normalize_decoded(decoded).each { |work| block.call(work) if work }
|
|
71
|
+
end
|
|
72
|
+
|
|
73
|
+
def parse_payload(payload)
|
|
74
|
+
parser = effective_parser
|
|
75
|
+
return parser.process(payload) if parser.respond_to?(:process)
|
|
76
|
+
return parser.parse(payload) if parser.respond_to?(:parse)
|
|
77
|
+
return parser.call(payload) if parser.respond_to?(:call)
|
|
78
|
+
|
|
79
|
+
raise ReplicationError, "pgoutput parser must respond to #process, #parse, or #call"
|
|
80
|
+
end
|
|
81
|
+
|
|
82
|
+
def decode_message(message, metadata)
|
|
83
|
+
decoder = effective_decoder
|
|
84
|
+
if decoder.respond_to?(:decode)
|
|
85
|
+
return callable_accepts_metadata?(decoder, :decode) ? decoder.decode(message, metadata) : decoder.decode(message)
|
|
86
|
+
end
|
|
87
|
+
if decoder.respond_to?(:call)
|
|
88
|
+
return callable_accepts_metadata?(decoder, :call) ? decoder.call(message, metadata) : decoder.call(message)
|
|
89
|
+
end
|
|
90
|
+
|
|
91
|
+
raise ReplicationError, "pgoutput decoder must respond to #decode or #call"
|
|
92
|
+
end
|
|
93
|
+
|
|
94
|
+
def normalize_decoded(decoded)
|
|
95
|
+
return [] if decoded.nil?
|
|
96
|
+
return decoded.flat_map { |item| normalize_decoded(item) } if decoded.is_a?(Array)
|
|
97
|
+
|
|
98
|
+
adapter = effective_adapter
|
|
99
|
+
result = if adapter.respond_to?(:normalize)
|
|
100
|
+
adapter.normalize(decoded)
|
|
101
|
+
elsif adapter.respond_to?(:call)
|
|
102
|
+
adapter.call(decoded)
|
|
103
|
+
else
|
|
104
|
+
raise ReplicationError, "pgoutput source adapter must respond to #normalize or #call"
|
|
105
|
+
end
|
|
106
|
+
|
|
107
|
+
result.is_a?(Array) ? result.compact : [result].compact
|
|
108
|
+
end
|
|
109
|
+
|
|
110
|
+
def effective_runner
|
|
111
|
+
runner || (@effective_runner ||= build_runner)
|
|
112
|
+
end
|
|
113
|
+
|
|
114
|
+
def effective_parser
|
|
115
|
+
parser || (@effective_parser ||= build_parser)
|
|
116
|
+
end
|
|
117
|
+
|
|
118
|
+
def effective_decoder
|
|
119
|
+
decoder || (@effective_decoder ||= build_decoder)
|
|
120
|
+
end
|
|
121
|
+
|
|
122
|
+
def effective_adapter
|
|
123
|
+
adapter || (@effective_adapter ||= build_adapter)
|
|
124
|
+
end
|
|
125
|
+
|
|
126
|
+
def build_runner
|
|
127
|
+
require_optional!("pgoutput/client", "pgoutput-client")
|
|
128
|
+
|
|
129
|
+
Pgoutput::Client::Runner.new(**runner_options)
|
|
130
|
+
end
|
|
131
|
+
|
|
132
|
+
def build_parser
|
|
133
|
+
require_optional!("pgoutput", "pgoutput-parser")
|
|
134
|
+
|
|
135
|
+
Pgoutput::RelationTracker.new
|
|
136
|
+
end
|
|
137
|
+
|
|
138
|
+
def build_decoder
|
|
139
|
+
require_optional!("pgoutput/decoder", "pgoutput-decoder")
|
|
140
|
+
|
|
141
|
+
Pgoutput::Decoder.new
|
|
142
|
+
end
|
|
143
|
+
|
|
144
|
+
def build_adapter
|
|
145
|
+
require_optional!("pgoutput/source_adapter", "pgoutput-source-adapter")
|
|
146
|
+
|
|
147
|
+
Pgoutput::SourceAdapter::Cdc.new
|
|
148
|
+
end
|
|
149
|
+
|
|
150
|
+
def runner_options
|
|
151
|
+
{
|
|
152
|
+
database_url: database_url,
|
|
153
|
+
slot_name: required_config("replication", "slot"),
|
|
154
|
+
publication_names: required_publications,
|
|
155
|
+
start_lsn: config.dig("replication", "start_lsn"),
|
|
156
|
+
auto_create_slot: !config.dig("replication", "auto_create_slot").nil?,
|
|
157
|
+
temporary_slot: !config.dig("replication", "temporary_slot").nil?
|
|
158
|
+
}.tap do |options|
|
|
159
|
+
feedback_interval = config.dig("replication", "feedback_interval")
|
|
160
|
+
options[:feedback_interval] = feedback_interval unless feedback_interval.nil?
|
|
161
|
+
end
|
|
162
|
+
end
|
|
163
|
+
|
|
164
|
+
def required_publications
|
|
165
|
+
publications = required_config("replication", "publications")
|
|
166
|
+
unless publications.is_a?(Array) && publications.any? &&
|
|
167
|
+
publications.all? { |publication| publication.is_a?(String) && !publication.empty? }
|
|
168
|
+
raise ReplicationError, "missing PostgreSQL source config: replication.publications"
|
|
169
|
+
end
|
|
170
|
+
|
|
171
|
+
publications
|
|
172
|
+
end
|
|
173
|
+
|
|
174
|
+
def callable_accepts_metadata?(object, method_name)
|
|
175
|
+
arity = object.respond_to?(:arity) && method_name == :call ? object.arity : object.method(method_name).arity
|
|
176
|
+
arity != 1
|
|
177
|
+
end
|
|
178
|
+
|
|
179
|
+
def database_url
|
|
180
|
+
password = ENV.fetch(required_config("postgres", "password_env"), nil)
|
|
181
|
+
credentials = required_config("postgres", "username").dup
|
|
182
|
+
credentials << ":#{password}" if password
|
|
183
|
+
|
|
184
|
+
"postgres://#{credentials}@#{required_config("postgres", "host")}:#{required_config("postgres", "port")}/" \
|
|
185
|
+
"#{required_config("postgres", "database")}"
|
|
186
|
+
end
|
|
187
|
+
|
|
188
|
+
def required_config(*keys)
|
|
189
|
+
value = config.dig(*keys)
|
|
190
|
+
raise ReplicationError, "missing PostgreSQL source config: #{keys.join(".")}" if value.nil? || value == ""
|
|
191
|
+
|
|
192
|
+
value
|
|
193
|
+
end
|
|
194
|
+
|
|
195
|
+
def require_optional!(feature, gem_name)
|
|
196
|
+
require feature
|
|
197
|
+
true
|
|
198
|
+
rescue LoadError => e
|
|
199
|
+
raise ReplicationError, "#{gem_name} is required for PostgreSQL CDC source integration: #{e.message}"
|
|
200
|
+
end
|
|
201
|
+
end
|
|
202
|
+
end
|
|
203
|
+
end
|
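The `callable_accepts_metadata?` check above relies on Ruby's arity rules: a callable that takes exactly one required argument reports arity `1`, and anything else is treated as accepting metadata. The sketch below lifts that method out of the class so its behavior can be observed directly. `OneArgDecoder` and `TwoArgDecoder` are hypothetical classes written for this example; they are not part of any pgoutput library.

```ruby
# Same logic as the private method in the diff, extracted for illustration.
def callable_accepts_metadata?(object, method_name)
  arity = object.respond_to?(:arity) && method_name == :call ? object.arity : object.method(method_name).arity
  arity != 1
end

# Hypothetical decoder that takes only the message.
class OneArgDecoder
  def decode(message)
    message
  end
end

# Hypothetical decoder that also takes replication metadata.
class TwoArgDecoder
  def decode(message, metadata)
    [message, metadata]
  end
end

one_arg_lambda = ->(message) { message }                            # arity 1
optional_lambda = ->(message, metadata = nil) { [message, metadata] } # arity -2

callable_accepts_metadata?(OneArgDecoder.new, :decode) # => false
callable_accepts_metadata?(TwoArgDecoder.new, :decode) # => true
callable_accepts_metadata?(one_arg_lambda, :call)      # => false
callable_accepts_metadata?(optional_lambda, :call)     # => true
```

Because the test is `arity != 1`, callables with optional or splat parameters (negative arity) are also passed the metadata argument.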
data/lib/mammoth/sqlite_store.rb
CHANGED
|
@@ -11,10 +11,15 @@ module Mammoth
|
|
|
11
11
|
# inspectable state required for reliability: schema versions, checkpoints,
|
|
12
12
|
# and dead letters.
|
|
13
13
|
class SQLiteStore
|
|
14
|
-
|
|
15
|
-
|
|
14
|
+
# Default SQLite database path used by local Mammoth runs.
|
|
15
|
+
DEFAULT_DB_PATH = File.expand_path("../../.sqlite3/mammoth.db", __dir__.to_s)
|
|
16
|
+
# Directory containing bundled SQLite schema migration files.
|
|
17
|
+
MIGRATION_DIR = File.expand_path("sql", __dir__.to_s)
|
|
18
|
+
# Initial schema migration file applied to new SQLite stores.
|
|
16
19
|
BOOTSTRAP_FILE = "__bootstrap__.sql"
|
|
20
|
+
# Synthetic schema version recorded after the bootstrap migration succeeds.
|
|
17
21
|
BOOTSTRAP_VERSION = "__bootstrap__"
|
|
22
|
+
# Table that records applied SQLite schema migrations.
|
|
18
23
|
MIGRATIONS_TABLE = "schema_migrations"
|
|
19
24
|
|
|
20
25
|
attr_reader :path
|
data/lib/mammoth/status.rb
CHANGED
|
@@ -26,6 +26,8 @@ module Mammoth
|
|
|
26
26
|
# @return [void]
|
|
27
27
|
def call
|
|
28
28
|
puts "Mammoth: #{config.dig("mammoth", "name")}"
|
|
29
|
+
puts "Replication slot: #{config.dig("replication", "slot")}"
|
|
30
|
+
puts "Replication publications: #{Array(config.dig("replication", "publications")).join(", ")}"
|
|
29
31
|
puts "Runtime: not started"
|
|
30
32
|
puts "SQLite: #{sqlite_path}"
|
|
31
33
|
puts "Webhook: #{config.dig("webhook", "name")}"
|
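Both the status output above and the delivery worker wrap `replication.publications` in `Array(...)` before joining. A short sketch of why that coercion is convenient: `Array(nil)` is an empty array and `Array("name")` is a one-element array, so the join does not raise when the key is missing or holds a single string. The `publication_label` helper is a hypothetical name used only for this illustration.

```ruby
# Hypothetical helper showing the Array(...) coercion used in the diff.
def publication_label(config, separator)
  Array(config.dig("replication", "publications")).join(separator)
end

publication_label({ "replication" => { "publications" => %w[a b] } }, ", ") # => "a, b"
publication_label({ "replication" => {} }, ", ")                            # => ""
publication_label({ "replication" => { "publications" => "solo" } }, ",")   # => "solo"
```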
data/lib/mammoth/version.rb
CHANGED
data/lib/mammoth/webhook_sink.rb
CHANGED
|
@@ -2,11 +2,13 @@
|
|
|
2
2
|
|
|
3
3
|
require "json"
|
|
4
4
|
require "net/http"
|
|
5
|
+
require "socket"
|
|
5
6
|
require "uri"
|
|
6
7
|
|
|
7
8
|
module Mammoth
|
|
8
9
|
# Delivers normalized Mammoth events to a webhook endpoint.
|
|
9
10
|
class WebhookSink
|
|
11
|
+
# HTTP status range treated as successful webhook delivery.
|
|
10
12
|
SUCCESS_RANGE = 200..299
|
|
11
13
|
|
|
12
14
|
attr_reader :name, :url, :timeout_seconds
|
data/lib/mammoth.rb
CHANGED
|
@@ -10,7 +10,8 @@ require_relative "mammoth/dead_letter_store"
|
|
|
10
10
|
require_relative "mammoth/event_serializer"
|
|
11
11
|
require_relative "mammoth/webhook_sink"
|
|
12
12
|
require_relative "mammoth/delivery_worker"
|
|
13
|
-
require_relative "mammoth/
|
|
13
|
+
require_relative "mammoth/sources/postgres"
|
|
14
|
+
require_relative "mammoth/cdc_source"
|
|
14
15
|
require_relative "mammoth/replication_consumer"
|
|
15
16
|
require_relative "mammoth/application"
|
|
16
17
|
require_relative "mammoth/cli"
|
metadata
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
|
2
2
|
name: mammoth
|
|
3
3
|
version: !ruby/object:Gem::Version
|
|
4
|
-
version: 0.
|
|
4
|
+
version: 0.1.0
|
|
5
5
|
platform: ruby
|
|
6
6
|
authors:
|
|
7
7
|
- Ken C. Demanawa
|
|
@@ -10,63 +10,63 @@ cert_chain: []
 date: 1980-01-02 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
-  name:
+  name: cdc-core
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '0.1'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '0.1'
 - !ruby/object:Gem::Dependency
-  name:
+  name: json-schema
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '2
+        version: '6.2'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '2
+        version: '6.2'
 - !ruby/object:Gem::Dependency
-  name:
+  name: pgoutput-client
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '0.
+        version: '0.2'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '0.
+        version: '0.2'
 - !ruby/object:Gem::Dependency
-  name: pgoutput-
+  name: pgoutput-decoder
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '0.
+        version: '0.1'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '0.
+        version: '0.1'
 - !ruby/object:Gem::Dependency
-  name: pgoutput-
+  name: pgoutput-parser
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
@@ -80,7 +80,7 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '0.1'
 - !ruby/object:Gem::Dependency
-  name: pgoutput-
+  name: pgoutput-source-adapter
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
@@ -94,39 +94,44 @@ dependencies:
       - !ruby/object:Gem::Version
         version: '0.1'
 - !ruby/object:Gem::Dependency
-  name:
+  name: sqlite3
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '2.9'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - "~>"
       - !ruby/object:Gem::Version
-        version: '
+        version: '2.9'
 description: |
   Mammoth is an OSS PostgreSQL change-event delivery appliance for Ruby.
 
-  It
-
+  It realizes the CDC Ecosystem pgoutput and cdc-core libraries for PostgreSQL,
+  then delivers normalized changes to webhook endpoints with durable
   checkpointing, retry state, dead letters, and operational visibility.
 
   Mammoth is application-first: it can be installed as a Ruby gem, packaged
   into a container image, or deployed into Kubernetes with Helm.
 email:
 - kenneth.c.demanawa@gmail.com
-executables:
+executables:
+- mammoth
 extensions: []
 extra_rdoc_files: []
 files:
 - CHANGELOG.md
 - LICENSE.txt
 - README.md
+- config/mammoth.example.yml
+- config/mammoth.schema.json
+- exe/mammoth
 - lib/mammoth.rb
 - lib/mammoth/application.rb
+- lib/mammoth/cdc_source.rb
 - lib/mammoth/checkpoint_store.rb
 - lib/mammoth/cli.rb
 - lib/mammoth/configuration.rb
@@ -134,8 +139,8 @@ files:
 - lib/mammoth/delivery_worker.rb
 - lib/mammoth/errors.rb
 - lib/mammoth/event_serializer.rb
-- lib/mammoth/pgoutput_source.rb
 - lib/mammoth/replication_consumer.rb
+- lib/mammoth/sources/postgres.rb
 - lib/mammoth/sql/__bootstrap__.sql
 - lib/mammoth/sqlite_store.rb
 - lib/mammoth/status.rb
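Every runtime dependency in the metadata above is pinned with the pessimistic operator `~>`. As a rough illustration of what those constraints admit, the sketch below evaluates a few of them with RubyGems' own `Gem::Requirement`. The `CONSTRAINTS` table and the `allowed?` helper are illustrative names, not part of the gem, and the probe versions are arbitrary.

```ruby
require "rubygems" # already loaded by default in modern Ruby; explicit for clarity

# Constraints copied from the new dependency list in the metadata diff.
CONSTRAINTS = {
  "cdc-core" => "~> 0.1",
  "json-schema" => "~> 6.2",
  "pgoutput-client" => "~> 0.2",
  "sqlite3" => "~> 2.9"
}.freeze

# Hypothetical helper: does `version` satisfy the constraint for `gem_name`?
def allowed?(gem_name, version)
  requirement = Gem::Requirement.new(CONSTRAINTS.fetch(gem_name))
  requirement.satisfied_by?(Gem::Version.new(version))
end

puts allowed?("sqlite3", "2.9.1")   # patch release within 2.x
puts allowed?("sqlite3", "3.0.0")   # next major version
puts allowed?("cdc-core", "0.4.0")  # later 0.x minor release
```

A two-segment constraint such as `~> 0.1` means `>= 0.1` and `< 1.0`, so it still admits later `0.x` minor releases; only a three-segment form such as `~> 0.1.0` would hold the minor version fixed.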
data/lib/mammoth/pgoutput_source.rb
DELETED
@@ -1,166 +0,0 @@
-# frozen_string_literal: true
-
-module Mammoth
-  # Streams PostgreSQL logical replication through the CDC Ecosystem boundary.
-  #
-  # PgoutputSource is Mammoth's upstream integration point. It composes the
-  # standalone pgoutput transport, parser, decoder, and source-adapter gems so
-  # the rest of Mammoth only receives CDC-core domain objects. Transport
-  # resiliency remains owned by pgoutput-client; Mammoth owns delivery.
-  class PgoutputSource
-    # @return [Mammoth::Configuration] loaded Mammoth configuration
-    attr_reader :config
-    # @return [Object, nil] pgoutput-client compatible runner
-    attr_reader :runner
-    # @return [Object, nil] pgoutput-parser compatible parser
-    attr_reader :parser
-    # @return [Object, nil] pgoutput-decoder compatible decoder
-    attr_reader :decoder
-    # @return [Object, nil] CDC source adapter
-    attr_reader :source_adapter
-
-    # Build the pgoutput integration source.
-    #
-    # @param config [Mammoth::Configuration] loaded configuration
-    # @param runner [Object, nil] injectable pgoutput-client runner
-    # @param parser [Object, nil] injectable pgoutput parser
-    # @param decoder [Object, nil] injectable pgoutput decoder
-    # @param source_adapter [Object, nil] injectable CDC source adapter
-    def initialize(config, runner: nil, parser: nil, decoder: nil, source_adapter: nil)
-      @config = config
-      @runner = runner
-      @parser = parser
-      @decoder = decoder
-      @source_adapter = source_adapter
-    end
-
-    # Stream CDC-core objects from PostgreSQL.
-    #
-    # @yieldparam work [Object] CDC::Core::ChangeEvent or TransactionEnvelope
-    # @return [void]
-    # @raise [Mammoth::ReplicationError] when required CDC components are unavailable
-    def each
-      return enum_for(:each) unless block_given?
-
-      effective_runner.start do |payload, metadata|
-        normalized_items(payload, metadata).each { |item| yield item }
-      end
-    end
-
-    private
-
-    def normalized_items(payload, metadata)
-      decoded = effective_decoder ? invoke_component(effective_decoder, parsed_payload(payload), metadata) : parsed_payload(payload)
-      normalized = invoke_source_adapter(decoded, metadata)
-      Array(normalized).flatten
-    end
-
-    def parsed_payload(payload)
-      return payload unless effective_parser
-
-      invoke_component(effective_parser, payload)
-    end
-
-    def invoke_source_adapter(decoded, metadata)
-      adapter = effective_source_adapter
-      if adapter.respond_to?(:normalize)
-        adapter.normalize(decoded)
-      elsif adapter.respond_to?(:call)
-        adapter.call(decoded, metadata)
-      else
-        raise ReplicationError, "pgoutput source adapter must respond to #normalize or #call"
-      end
-    end
-
-    def invoke_component(component, *args)
-      if component.respond_to?(:call)
-        component.call(*args)
-      elsif component.respond_to?(:parse)
-        component.parse(*args)
-      elsif component.respond_to?(:decode)
-        component.decode(*args)
-      else
-        raise ReplicationError, "#{component.class} must respond to #call, #parse, or #decode"
-      end
-    end
-
-    def effective_runner
-      @runner ||= build_runner
-    end
-
-    def effective_parser
-      @parser ||= build_parser
-    end
-
-    def effective_decoder
-      @decoder ||= build_decoder
-    end
-
-    def effective_source_adapter
-      @source_adapter ||= build_source_adapter
-    end
-
-    def build_runner
-      require_optional!("pgoutput_client", "pgoutput-client")
-      Pgoutput::Client::Runner.new(
-        database_url: database_url,
-        slot_name: config.dig("replication", "slot"),
-        publication_names: [config.dig("replication", "publication")],
-        start_lsn: config.dig("replication", "start_lsn"),
-        auto_create_slot: config.dig("replication", "auto_create_slot") || false
-      )
-    end
-
-    def build_parser
-      require_any!(["pgoutput_parser", "pgoutput/parser"], "pgoutput-parser")
-      constant_or_nil("Pgoutput::Parser") || constant_or_nil("Pgoutput::Parser::Parser")
-    end
-
-    def build_decoder
-      require_any!(["pgoutput_decoder", "pgoutput/decoder"], "pgoutput-decoder")
-      constant_or_nil("Pgoutput::Decoder") || constant_or_nil("Pgoutput::Decoder::ValueDecoder")
-    end
-
-    def build_source_adapter
-      require_optional!("cdc_core", "cdc-core")
-      require_any!(["pgoutput_source_adapter", "pgoutput/source_adapter/cdc"], "pgoutput-source-adapter")
-
-      adapter_class = constant_or_nil("Pgoutput::SourceAdapter::Cdc")
-      raise ReplicationError, "Pgoutput::SourceAdapter::Cdc is unavailable" unless adapter_class
-
-      adapter_class.new
-    end
-
-    def database_url
-      password = ENV.fetch(config.dig("postgres", "password_env"), "")
-      user = config.dig("postgres", "username")
-      host = config.dig("postgres", "host")
-      port = config.dig("postgres", "port")
-      database = config.dig("postgres", "database")
-      "postgres://#{user}:#{password}@#{host}:#{port}/#{database}"
-    end
-
-    def require_optional!(feature, gem_name)
-      require feature
-    rescue LoadError => e
-      raise ReplicationError, "#{gem_name} is required for live pgoutput replication: #{e.message}"
-    end
-
-    def require_any!(features, gem_name)
-      errors = []
-      features.each do |feature|
-        require feature
-        return true
-      rescue LoadError => e
-        errors << e.message
-      end
-      raise ReplicationError, "#{gem_name} is required for live pgoutput replication: #{errors.join("; ")}"
-    end
-
-    def constant_or_nil(name)
-      name.split("::").reduce(Object) { |scope, const_name| scope.const_get(const_name, false) }
-    rescue NameError
-      nil
-    end
-  end
-end
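The removed `invoke_component` method dispatched on whichever of `#call`, `#parse`, or `#decode` a component responded to, which is what let callers inject a lambda or a full parser object. A standalone sketch of that dispatch follows, under stated assumptions: `UpcaseParser` and `lambda_decoder` are stand-ins invented for illustration, and `ArgumentError` replaces `Mammoth::ReplicationError` so the snippet runs without the gem.

```ruby
# Illustrative only: mirrors the respond_to? dispatch order of the removed
# PgoutputSource#invoke_component, using stand-in components defined below.
def invoke_component(component, *args)
  if component.respond_to?(:call)
    component.call(*args)
  elsif component.respond_to?(:parse)
    component.parse(*args)
  elsif component.respond_to?(:decode)
    component.decode(*args)
  else
    # The gem raised Mammoth::ReplicationError here; ArgumentError is a substitute.
    raise ArgumentError, "#{component.class} must respond to #call, #parse, or #decode"
  end
end

# Hypothetical parser object exposing only #parse.
class UpcaseParser
  def parse(payload)
    payload.upcase
  end
end

# Hypothetical decoder supplied as a lambda, so it is reached through #call.
lambda_decoder = ->(payload, metadata) { { payload: payload, lsn: metadata[:lsn] } }

parsed = invoke_component(UpcaseParser.new, "insert")
decoded = invoke_component(lambda_decoder, parsed, { lsn: "0/16B6C50" })
```

Because `#call` is checked first, an object that responds to both `#call` and `#parse` is always invoked through `#call`.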