fluent-plugin-parser-avro 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/LICENSE +1 -1
- data/README.md +46 -7
- data/{fluent-plugin-avro.gemspec → fluent-plugin-parser-avro.gemspec} +1 -1
- data/lib/fluent/plugin/confluent_avro_schema_registry.rb +49 -0
- data/lib/fluent/plugin/parser_avro.rb +55 -32
- data/test/data/schema-persions-value-1.avsc +1 -0
- data/test/data/schema-persions-value-21.avsc +1 -0
- data/test/data/schema-persions-value-41.avsc +1 -0
- data/test/data/schema-persions-value-42.avsc +1 -0
- data/test/plugin/test_parser_avro.rb +120 -36
- metadata +12 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 9633a316c1de1e4cb83b487d99d5a3814f1cde2b96d20633cfe728c584108761
|
4
|
+
data.tar.gz: 6438fe5e248c98540602183a5828a7dbc596fe5000c4993fd428ee9223ed3077
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 8b8894c17aaa33916ef3601634d36139f2817db875eba2c5e552a0309f2c5f3ce6578526c5aad54244cd51ce36cf30bc04f64ce576fcbd4ccb89fc7726d4102d
|
7
|
+
data.tar.gz: a9dbc9e144b1e0f7f2b585a51c0519e6c2953d7258b5bd13fd8d6246754f9956bffcee43777378692ea4ef2d8997d8f6648653d8928e2345e70c32e48ab849a6
|
data/LICENSE
CHANGED
@@ -187,7 +187,7 @@
|
|
187
187
|
same "printed page" as the copyright notice for easier
|
188
188
|
identification within third-party archives.
|
189
189
|
|
190
|
-
Copyright
|
190
|
+
Copyright [yyyy] [name of copyright owner]
|
191
191
|
|
192
192
|
Licensed under the Apache License, Version 2.0 (the "License");
|
193
193
|
you may not use this file except in compliance with the License.
|
data/README.md
CHANGED
@@ -32,12 +32,22 @@ $ bundle
|
|
32
32
|
* **schema_file** (string) (optional): avro schema file path.
|
33
33
|
* **schema_json** (string) (optional): avro schema definition hash.
|
34
34
|
* **schema_url** (string) (optional): avro schema remote URL.
|
35
|
-
* **schema_registery_with_subject_url** (string) (optional): avro schema registry URL.
|
36
35
|
* **schema_url_key** (string) (optional): avro schema registry or something's response schema key.
|
37
36
|
* **writers_schema_file** (string) (optional): avro schema file path for writers definition.
|
38
37
|
* **writers_schema_json** (string) (optional): avro schema definition hash for writers definition.
|
39
38
|
* **readers_schema_file** (string) (optional): avro schema file path for readers definition.
|
40
39
|
* **readers_schema_json** (string) (optional): avro schema definition hash for readers definition.
|
40
|
+
* **use_confluent_schema** (bool) (optional): Assume to use confluent schema. Confluent avro schema uses the first 5-bytes for magic byte (1 byte) and schema_id (4 bytes). This parameter specifies to skip reading the first 5-bytes or not.
|
41
|
+
* Default value: `true`.
|
42
|
+
|
43
|
+
### \<confluent_registry\> section (optional) (single)
|
44
|
+
|
45
|
+
* **url** (string) (required): confluent schema registry URL.
|
46
|
+
* **subject** (string) (required): Specify schema subject.
|
47
|
+
* **schema_key** (string) (optional): Specify schema key on confluent registry REST API response.
|
48
|
+
* Default value: `schema`.
|
49
|
+
* **schema_version** (string) (optional): Specify schema version for the specified subject.
|
50
|
+
* Default value: `latest`.
|
41
51
|
|
42
52
|
### Configuration Example
|
43
53
|
|
@@ -48,7 +58,14 @@ $ bundle
|
|
48
58
|
# schema_json { "namespace": "org.fluentd.parser.avro", "type": "record", "name": "User", "fields" : [{"name": "username", "type": "string"}, {"name": "age", "type": "int"}, {"name": "verified", "type": ["boolean", "null"], "default": false}]}
|
49
59
|
# schema_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/[the latest schema version]
|
50
60
|
# schema_key schema
|
51
|
-
#
|
61
|
+
# When using with confluent registry without <confluent_registry>, this parameter must be true.
|
62
|
+
# use_confluent_schema true
|
63
|
+
#<confluent_registry>
|
64
|
+
# url http://localhost:8081/
|
65
|
+
# subject your-awesome-subject
|
66
|
+
# # schema_key schema
|
67
|
+
# # schema_version 1
|
68
|
+
#</confluent_registry>
|
52
69
|
</parse>
|
53
70
|
```
|
54
71
|
|
@@ -58,22 +75,44 @@ Confluent AVRO schema registry should respond with REST API.
|
|
58
75
|
|
59
76
|
This plugin uses the following API:
|
60
77
|
|
61
|
-
* [`GET /subjects/(string: subject)/versions`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
|
62
78
|
* [`GET /subjects/(string: subject)/versions/(versionId: version)`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
|
63
79
|
|
64
|
-
Users can specify a URL for retrieving the latest schemna information
|
80
|
+
Users can specify a URL for retrieving the latest schemna information with `<confluent_registry>`:
|
65
81
|
|
66
|
-
e.g.)
|
82
|
+
e.g.)
|
83
|
+
```
|
84
|
+
<confluent_registry>
|
85
|
+
url http://[confluent registry server ip]:[port]/
|
86
|
+
subject your-awesome-subject
|
87
|
+
# schema_key schema
|
88
|
+
# schema_version 1
|
89
|
+
</confluent_registry>
|
90
|
+
```
|
67
91
|
|
68
92
|
For example, when specifying the following configuration:
|
69
93
|
|
70
94
|
```
|
71
95
|
<parse>
|
72
96
|
@type avro
|
73
|
-
|
97
|
+
<confluent_registry>
|
98
|
+
url http://localhost:8081/
|
99
|
+
subject persons-avro-value
|
100
|
+
# schema_key schema
|
101
|
+
# schema_version 1
|
102
|
+
</confluent_registry>
|
74
103
|
```
|
75
104
|
|
76
|
-
Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions
|
105
|
+
Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/latest` to retrive the registered schema versions. And when parsing failure occurred, this plugin will call `GET http://localhost:8081/schemas/ids/<schema id which is obtained from the second record on avro schema>`.
|
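The two registry endpoints named in the README text above can be sketched as follows. This is a minimal illustration, assuming the registry URL is joined with an absolute path in the same way the added `ConfluentAvroSchemaRegistry` class does. The helper names `subject_version_uri` and `schema_id_uri` are hypothetical and do not exist in the plugin.

```ruby
require "uri"

# Hypothetical helpers for illustration only; just the endpoint paths
# are taken from the diff.
def subject_version_uri(registry_url, subject, version = "latest")
  # The reference starts with "/", so it replaces any path in registry_url.
  URI.join(registry_url, "/subjects/#{subject}/versions/#{version}")
end

def schema_id_uri(registry_url, schema_id)
  URI.join(registry_url, "/schemas/ids/#{schema_id}")
end

puts subject_version_uri("http://localhost:8081/", "persons-avro-value")
puts schema_id_uri("http://localhost:8081/", 1)
```

Because the joined reference is an absolute path, a registry URL that carries its own path prefix would lose that prefix. That may matter for registries served behind a reverse proxy.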
106
|
+
|
107
|
+
If you use this plugin to parse confluent schema, please specify `use_confluent_schema` as `true`.
|
108
|
+
|
109
|
+
This is because, confluent avro schema uses the following structure:
|
110
|
+
|
111
|
+
MAGIC_BYTE | schema_id | record
|
112
|
+
----------:|:---------:|:---------------
|
113
|
+
1byte | 4bytes | record contents
|
114
|
+
|
115
|
+
When specifying `<confluent_registry>` section on configuration, this plugin will skip to read the first 5-bytes automatically and parse `schema_id` from there.
|
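The 5-byte framing described above can be sketched in plain Ruby, without the `avro` gem. This is a rough sketch under stated assumptions: `frame` and `unframe` are hypothetical helper names, and the record body is treated as opaque bytes rather than decoded Avro.

```ruby
# Hypothetical helpers for illustration; not part of the plugin.

# Prepend the magic byte (0x00) and a 4-byte big-endian schema id.
def frame(schema_id, avro_body)
  [0].pack("C") + [schema_id].pack("N") + avro_body.b
end

# Split a framed message into [schema_id, record_bytes].
def unframe(data)
  bytes = data.b
  raise ArgumentError, "message shorter than the 5-byte header" if bytes.bytesize < 5

  magic = bytes.byteslice(0, 1)
  unless magic == [0].pack("C")
    raise ArgumentError, "the first byte should be the magic byte but got #{magic.inspect}"
  end

  schema_id = bytes.byteslice(1, 4).unpack1("N")
  [schema_id, bytes.byteslice(5, bytes.bytesize - 5)]
end

schema_id, body = unframe(frame(42, "record contents"))
puts schema_id
puts body
```

The `"N"` directive packs and unpacks an unsigned 32-bit integer in network (big-endian) byte order. The plugin and its test helper use the same directive for `schema_id`.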
77
116
|
|
78
117
|
## Copyright
|
79
118
|
|
data/lib/fluent/plugin/confluent_avro_schema_registry.rb
@@ -0,0 +1,49 @@
|
|
1
|
+
#
|
2
|
+
# Copyright 2020- Hiroshi Hatake
|
3
|
+
#
|
4
|
+
# Licensed under the Apache License, Version 2.0 (the "License");
|
5
|
+
# you may not use this file except in compliance with the License.
|
6
|
+
# You may obtain a copy of the License at
|
7
|
+
#
|
8
|
+
# http://www.apache.org/licenses/LICENSE-2.0
|
9
|
+
#
|
10
|
+
# Unless required by applicable law or agreed to in writing, software
|
11
|
+
# distributed under the License is distributed on an "AS IS" BASIS,
|
12
|
+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
13
|
+
# See the License for the specific language governing permissions and
|
14
|
+
# limitations under the License.
|
15
|
+
|
16
|
+
require "net/http"
|
17
|
+
require "uri"
|
18
|
+
|
19
|
+
module Fluent
|
20
|
+
module Plugin
|
21
|
+
class ConfluentAvroSchemaRegistry
|
22
|
+
def initialize(registry_url)
|
23
|
+
@registry_url = registry_url
|
24
|
+
end
|
25
|
+
|
26
|
+
def subject_version(subject, schema_key, version = "latest")
|
27
|
+
registry_uri = URI.parse(@registry_url)
|
28
|
+
registry_uri_with_versions = URI.join(registry_uri, "/subjects/#{subject}/versions/#{version}")
|
29
|
+
response = Net::HTTP.get_response(registry_uri_with_versions)
|
30
|
+
if schema_key.nil?
|
31
|
+
response.body
|
32
|
+
else
|
33
|
+
Yajl.load(response.body)[schema_key]
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
def schema_with_id(schema_id, schema_key)
|
38
|
+
registry_uri = URI.parse(@registry_url)
|
39
|
+
registry_uri_with_ids = URI.join(registry_uri, "/schemas/ids/#{schema_id}")
|
40
|
+
response = Net::HTTP.get_response(registry_uri_with_ids)
|
41
|
+
if schema_key.nil?
|
42
|
+
response.body
|
43
|
+
else
|
44
|
+
Yajl.load(response.body)[schema_key]
|
45
|
+
end
|
46
|
+
end
|
47
|
+
end
|
48
|
+
end
|
49
|
+
end
|
data/lib/fluent/plugin/parser_avro.rb
CHANGED
@@ -18,21 +18,30 @@ require "net/http"
|
|
18
18
|
require "stringio"
|
19
19
|
require "uri"
|
20
20
|
require "fluent/plugin/parser"
|
21
|
+
require_relative "./confluent_avro_schema_registry"
|
21
22
|
|
22
23
|
module Fluent
|
23
24
|
module Plugin
|
24
25
|
class AvroParser < Fluent::Plugin::Parser
|
25
26
|
Fluent::Plugin.register_parser("avro", self)
|
26
27
|
|
28
|
+
MAGIC_BYTE = [0].pack("C").freeze
|
29
|
+
|
27
30
|
config_param :schema_file, :string, :default => nil
|
28
31
|
config_param :schema_json, :string, :default => nil
|
29
32
|
config_param :schema_url, :string, :default => nil
|
30
|
-
config_param :schema_registery_with_subject_url, :string, :default => nil
|
31
33
|
config_param :schema_url_key, :string, :default => nil
|
32
34
|
config_param :writers_schema_file, :string, :default => nil
|
33
35
|
config_param :writers_schema_json, :string, :default => nil
|
34
36
|
config_param :readers_schema_file, :string, :default => nil
|
35
37
|
config_param :readers_schema_json, :string, :default => nil
|
38
|
+
config_param :use_confluent_schema, :bool, :default => true
|
39
|
+
config_section :confluent_registry, param_name: :avro_registry, required: false, multi: false do
|
40
|
+
config_param :url, :string
|
41
|
+
config_param :subject, :string
|
42
|
+
config_param :schema_key, :string, :default => "schema"
|
43
|
+
config_param :schema_version, :string, :default => "latest"
|
44
|
+
end
|
36
45
|
|
37
46
|
def configure(conf)
|
38
47
|
super
|
@@ -60,18 +69,20 @@ module Fluent
|
|
60
69
|
@writers_schema = Avro::Schema.parse(@writers_raw_schema)
|
61
70
|
@readers_schema = Avro::Schema.parse(@readers_raw_schema)
|
62
71
|
@reader = Avro::IO::DatumReader.new(@writers_schema, @readers_schema)
|
72
|
+
elsif @avro_registry
|
73
|
+
@confluent_registry = Fluent::Plugin::ConfluentAvroSchemaRegistry.new(@avro_registry.url)
|
74
|
+
@raw_schema = @confluent_registry.subject_version(@avro_registry.subject,
|
75
|
+
@avro_registry.schema_key,
|
76
|
+
@avro_registry.schema_version)
|
77
|
+
@schema = Avro::Schema.parse(@raw_schema)
|
78
|
+
@reader = Avro::IO::DatumReader.new(@schema)
|
63
79
|
else
|
64
|
-
unless [@schema_json, @schema_file, @schema_url
|
80
|
+
unless [@schema_json, @schema_file, @schema_url].compact.size == 1
|
65
81
|
raise Fluent::ConfigError, "schema_json, schema_file, or schema_url is required, but they cannot specify at the same time!"
|
66
82
|
end
|
67
|
-
if @schema_registery_with_subject_url && !@schema_registery_with_subject_url.end_with?("/")
|
68
|
-
raise Fluent::ConfigError, "schema_registery_with_subject_url must contain the trailing slash('/')."
|
69
|
-
end
|
70
83
|
|
71
84
|
@raw_schema = if @schema_file
|
72
85
|
File.read(@schema_file)
|
73
|
-
elsif @schema_registery_with_subject_url
|
74
|
-
fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
|
75
86
|
elsif @schema_url
|
76
87
|
fetch_schema(@schema_url, @schema_url_key)
|
77
88
|
elsif @schema_json
|
@@ -91,25 +102,55 @@ module Fluent
|
|
91
102
|
buffer = StringIO.new(data)
|
92
103
|
decoder = Avro::IO::BinaryDecoder.new(buffer)
|
93
104
|
begin
|
105
|
+
if @use_confluent_schema || @avro_registry
|
106
|
+
# When using confluent avro schema, record is formatted as follows:
|
107
|
+
#
|
108
|
+
# MAGIC_BYTE | schema_id | record
|
109
|
+
# ----------:|:---------:|:---------------
|
110
|
+
# 1byte | 4bytes | record contents
|
111
|
+
magic_byte = decoder.read(1)
|
112
|
+
|
113
|
+
if magic_byte != MAGIC_BYTE
|
114
|
+
raise "The first byte should be magic byte but got {magic_byte.inspect}"
|
115
|
+
end
|
116
|
+
schema_id = decoder.read(4).unpack("N").first
|
117
|
+
end
|
94
118
|
decoded_data = @reader.read(decoder)
|
95
119
|
time, record = convert_values(parse_time(decoded_data), decoded_data)
|
96
120
|
yield time, record
|
97
|
-
rescue => e
|
98
|
-
raise e
|
121
|
+
rescue EOFError, RuntimeError => e
|
122
|
+
raise e unless [@schema_url, @avro_registry].compact.size == 1
|
99
123
|
begin
|
100
124
|
new_raw_schema = if @schema_url
|
101
125
|
fetch_schema(@schema_url, @schema_url_key)
|
102
|
-
elsif @
|
103
|
-
|
126
|
+
elsif @avro_registry
|
127
|
+
@confluent_registry.schema_with_id(schema_id,
|
128
|
+
@avro_registry.schema_key)
|
104
129
|
end
|
105
130
|
new_schema = Avro::Schema.parse(new_raw_schema)
|
106
|
-
is_changed = (
|
131
|
+
is_changed = (new_raw_schema != @raw_schema)
|
107
132
|
@raw_schema = new_raw_schema
|
108
|
-
@
|
109
|
-
rescue
|
133
|
+
@schema = new_schema
|
134
|
+
rescue EOFError, RuntimeError
|
110
135
|
# Do nothing.
|
111
136
|
end
|
112
137
|
if is_changed
|
138
|
+
buffer = StringIO.new(data)
|
139
|
+
decoder = Avro::IO::BinaryDecoder.new(buffer)
|
140
|
+
if @use_confluent_schema || @avro_registry
|
141
|
+
# When using confluent avro schema, record is formatted as follows:
|
142
|
+
#
|
143
|
+
# MAGIC_BYTE | schema_id | record
|
144
|
+
# ----------:|:---------:|:---------------
|
145
|
+
# 1byte | 4bytes | record contents
|
146
|
+
magic_byte = decoder.read(1)
|
147
|
+
|
148
|
+
if magic_byte != MAGIC_BYTE
|
149
|
+
raise "The first byte should be magic byte but got {magic_byte.inspect}"
|
150
|
+
end
|
151
|
+
schema_id = decoder.read(4).unpack("N").first
|
152
|
+
end
|
153
|
+
@reader = Avro::IO::DatumReader.new(@schema)
|
113
154
|
decoded_data = @reader.read(decoder)
|
114
155
|
time, record = convert_values(parse_time(decoded_data), decoded_data)
|
115
156
|
yield time, record
|
@@ -119,24 +160,6 @@ module Fluent
|
|
119
160
|
end
|
120
161
|
end
|
121
162
|
|
122
|
-
def fetch_schema_versions(base_uri_with_versions)
|
123
|
-
versions_response = Net::HTTP.get_response(base_uri_with_versions)
|
124
|
-
Yajl.load(versions_response.body)
|
125
|
-
end
|
126
|
-
|
127
|
-
def fetch_latest_schema(base_url, schema_key)
|
128
|
-
base_uri = URI.parse(base_url)
|
129
|
-
base_uri_with_versions = URI.join(base_uri, "versions/")
|
130
|
-
versions = fetch_schema_versions(base_uri_with_versions)
|
131
|
-
uri = URI.join(base_uri_with_versions, versions.last.to_s)
|
132
|
-
response = Net::HTTP.get_response(uri)
|
133
|
-
if schema_key.nil?
|
134
|
-
response.body
|
135
|
-
else
|
136
|
-
Yajl.load(response.body)[schema_key]
|
137
|
-
end
|
138
|
-
end
|
139
|
-
|
140
163
|
def fetch_schema(url, schema_key)
|
141
164
|
uri = URI.parse(url)
|
142
165
|
response = Net::HTTP.get_response(uri)
|
data/test/data/schema-persions-value-1.avsc
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"}]}"}
|
data/test/data/schema-persions-value-21.avsc
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\",\"default\":false}]}"}
|
data/test/data/schema-persions-value-41.avsc
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\"}]}"}
|
data/test/data/schema-persions-value-42.avsc
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":[\"boolean\",\"null\"],\"default\":false}]}"}
|
data/test/plugin/test_parser_avro.rb
CHANGED
@@ -76,74 +76,94 @@ class AvroParserTest < Test::Unit::TestCase
|
|
76
76
|
}
|
77
77
|
EOC
|
78
78
|
|
79
|
-
|
79
|
+
data("use_confluent_schema" => true,
|
80
|
+
"plain" => false)
|
81
|
+
def test_parse(data)
|
82
|
+
config = data
|
80
83
|
conf = {
|
81
|
-
'schema_json' => SCHEMA
|
84
|
+
'schema_json' => SCHEMA,
|
85
|
+
'use_confluent_schema' => config,
|
82
86
|
}
|
83
87
|
d = create_driver(conf)
|
84
88
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
85
|
-
encoded = encode_datum(datum, SCHEMA)
|
89
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
86
90
|
d.instance.parse(encoded) do |_time, record|
|
87
91
|
assert_equal datum, record
|
88
92
|
end
|
89
93
|
|
90
94
|
datum = {"username" => "baz", "age" => 34}
|
91
|
-
encoded = encode_datum(datum, SCHEMA)
|
95
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
92
96
|
d.instance.parse(encoded) do |_time, record|
|
93
97
|
assert_equal datum.merge("verified" => nil), record
|
94
98
|
end
|
95
99
|
end
|
96
100
|
|
97
|
-
|
101
|
+
data("use_confluent_schema" => true,
|
102
|
+
"plain" => false)
|
103
|
+
def test_parse_with_avro_schema(data)
|
104
|
+
config = data
|
98
105
|
conf = {
|
99
|
-
'schema_file' => File.join(__dir__, "..", "data", "user.avsc")
|
106
|
+
'schema_file' => File.join(__dir__, "..", "data", "user.avsc"),
|
107
|
+
'use_confluent_schema' => config,
|
100
108
|
}
|
101
109
|
d = create_driver(conf)
|
102
110
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
103
|
-
encoded = encode_datum(datum, SCHEMA)
|
111
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
104
112
|
d.instance.parse(encoded) do |_time, record|
|
105
113
|
assert_equal datum, record
|
106
114
|
end
|
107
115
|
|
108
116
|
datum = {"username" => "baz", "age" => 34}
|
109
|
-
encoded = encode_datum(datum, SCHEMA)
|
117
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
110
118
|
d.instance.parse(encoded) do |_time, record|
|
111
119
|
assert_equal datum.merge("verified" => nil), record
|
112
120
|
end
|
113
121
|
end
|
114
122
|
|
115
|
-
|
123
|
+
data("use_confluent_schema" => true,
|
124
|
+
"plain" => false)
|
125
|
+
def test_parse_with_readers_and_writers_schema(data)
|
126
|
+
config = data
|
116
127
|
conf = {
|
117
128
|
'writers_schema_json' => SCHEMA,
|
118
129
|
'readers_schema_json' => READERS_SCHEMA,
|
130
|
+
'use_confluent_schema' => config,
|
119
131
|
}
|
120
132
|
d = create_driver(conf)
|
121
133
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
122
|
-
encoded = encode_datum(datum, SCHEMA)
|
134
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
123
135
|
d.instance.parse(encoded) do |_time, record|
|
124
136
|
datum.delete("verified")
|
125
137
|
assert_equal datum, record
|
126
138
|
end
|
127
139
|
end
|
128
140
|
|
129
|
-
|
141
|
+
data("use_confluent_schema" => true,
|
142
|
+
"plain" => false)
|
143
|
+
def test_parse_with_readers_and_writers_schema_files(data)
|
144
|
+
config = data
|
130
145
|
conf = {
|
131
146
|
'writers_schema_file' => File.join(__dir__, "..", "data", "writer_user.avsc"),
|
132
147
|
'readers_schema_file' => File.join(__dir__, "..", "data", "reader_user.avsc"),
|
148
|
+
'use_confluent_schema' => config,
|
133
149
|
}
|
134
150
|
d = create_driver(conf)
|
135
151
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
136
|
-
encoded = encode_datum(datum, SCHEMA)
|
152
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
137
153
|
d.instance.parse(encoded) do |_time, record|
|
138
154
|
datum.delete("verified")
|
139
155
|
assert_equal datum, record
|
140
156
|
end
|
141
157
|
end
|
142
158
|
|
143
|
-
|
159
|
+
data("use_confluent_schema" => true,
|
160
|
+
"plain" => false)
|
161
|
+
def test_parse_with_complex_schema(data)
|
162
|
+
config = data
|
144
163
|
conf = {
|
145
164
|
'schema_json' => COMPLEX_SCHEMA,
|
146
|
-
'time_key' => 'time'
|
165
|
+
'time_key' => 'time',
|
166
|
+
'use_confluent_schema' => config,
|
147
167
|
}
|
148
168
|
d = create_driver(conf)
|
149
169
|
time_str = "2020-09-25 15:08:09.082113 +0900"
|
@@ -162,7 +182,7 @@ class AvroParserTest < Test::Unit::TestCase
|
|
162
182
|
}
|
163
183
|
}
|
164
184
|
|
165
|
-
encoded = encode_datum(datum, COMPLEX_SCHEMA)
|
185
|
+
encoded = encode_datum(datum, COMPLEX_SCHEMA, config)
|
166
186
|
d.instance.parse(encoded) do |time, record|
|
167
187
|
assert_equal Time.parse(time_str).to_r, time.to_r
|
168
188
|
datum.delete("time")
|
@@ -185,6 +205,22 @@ class AvroParserTest < Test::Unit::TestCase
|
|
185
205
|
res.status = 200
|
186
206
|
res.body = 'running'
|
187
207
|
end
|
208
|
+
server.mount_proc("/schemas/ids") do |req, res|
|
209
|
+
req.path =~ /^\/schemas\/ids\/([^\/]*)$/
|
210
|
+
version = $1
|
211
|
+
@got.push({
|
212
|
+
version: version,
|
213
|
+
})
|
214
|
+
if version == "1"
|
215
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc"))
|
216
|
+
elsif version == "21"
|
217
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-21.avsc"))
|
218
|
+
elsif version == "41"
|
219
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-41.avsc"))
|
220
|
+
elsif version == "42"
|
221
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-42.avsc"))
|
222
|
+
end
|
223
|
+
end
|
188
224
|
server.mount_proc("/subjects") do |req, res|
|
189
225
|
req.path =~ /^\/subjects\/([^\/]*)\/([^\/]*)\/(.*)$/
|
190
226
|
avro_registered_name = $1
|
@@ -204,6 +240,8 @@ class AvroParserTest < Test::Unit::TestCase
|
|
204
240
|
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value3.avsc"))
|
205
241
|
elsif version == "4"
|
206
242
|
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
|
243
|
+
elsif version == "latest"
|
244
|
+
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
|
207
245
|
end
|
208
246
|
end
|
209
247
|
server.start
|
@@ -318,63 +356,109 @@ class AvroParserTest < Test::Unit::TestCase
|
|
318
356
|
assert_equal 4, @got.size
|
319
357
|
assert_equal 'persons-avro-value', @got[3][:registered_name]
|
320
358
|
assert_equal '3', @got[3][:version]
|
359
|
+
|
360
|
+
assert_equal '200', client.request_get('/schemas/ids/1').code
|
361
|
+
assert_equal 5, @got.size
|
362
|
+
assert_nil @got[4][:registered_name]
|
363
|
+
assert_equal '1', @got[4][:version]
|
364
|
+
|
365
|
+
assert_equal '200', client.request_get('/schemas/ids/21').code
|
366
|
+
assert_equal 6, @got.size
|
367
|
+
assert_nil @got[5][:registered_name]
|
368
|
+
assert_equal '21', @got[5][:version]
|
369
|
+
|
370
|
+
assert_equal '200', client.request_get('/schemas/ids/41').code
|
371
|
+
assert_equal 7, @got.size
|
372
|
+
assert_nil @got[6][:registered_name]
|
373
|
+
assert_equal '41', @got[6][:version]
|
374
|
+
|
375
|
+
assert_equal '200', client.request_get('/schemas/ids/42').code
|
376
|
+
assert_equal 8, @got.size
|
377
|
+
assert_nil @got[7][:registered_name]
|
378
|
+
assert_equal '42', @got[7][:version]
|
321
379
|
end
|
322
380
|
|
323
|
-
|
381
|
+
data("use_confluent_schema" => true,
|
382
|
+
"plain" => false)
|
383
|
+
def test_schema_url(data)
|
384
|
+
config = data
|
324
385
|
conf = {
|
325
386
|
'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/1",
|
326
|
-
'schema_url_key' => 'schema'
|
387
|
+
'schema_url_key' => 'schema',
|
388
|
+
'use_confluent_schema' => config,
|
327
389
|
}
|
328
390
|
d = create_driver(conf)
|
329
391
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
330
|
-
encoded = encode_datum(datum, REMOTE_SCHEMA)
|
392
|
+
encoded = encode_datum(datum, REMOTE_SCHEMA, config)
|
331
393
|
d.instance.parse(encoded) do |_time, record|
|
332
394
|
assert_equal datum, record
|
333
395
|
end
|
334
396
|
end
|
335
397
|
|
336
|
-
|
398
|
+
data("use_confluent_schema" => true,
|
399
|
+
"plain" => false)
|
400
|
+
def test_schema_url_with_version2(data)
|
401
|
+
config = data
|
337
402
|
conf = {
|
338
403
|
'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/2",
|
339
|
-
'schema_url_key' => 'schema'
|
404
|
+
'schema_url_key' => 'schema',
|
405
|
+
'use_confluent_schema' => config,
|
340
406
|
}
|
341
407
|
d = create_driver(conf)
|
342
408
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
343
|
-
encoded = encode_datum(datum, REMOTE_SCHEMA2)
|
409
|
+
encoded = encode_datum(datum, REMOTE_SCHEMA2, config)
|
344
410
|
d.instance.parse(encoded) do |_time, record|
|
345
411
|
assert_equal datum.merge("verified" => false), record
|
346
412
|
end
|
347
413
|
end
|
348
414
|
|
349
|
-
def
|
350
|
-
conf =
|
351
|
-
'
|
352
|
-
|
353
|
-
|
415
|
+
def test_confluent_registry_with_schema_version
|
416
|
+
conf = Fluent::Config::Element.new(
|
417
|
+
'', '', {'@type' => 'avro'}, [
|
418
|
+
Fluent::Config::Element.new('confluent_registry', '', {
|
419
|
+
'url' => 'http://localhost:8081',
|
420
|
+
'subject' => 'persons-avro-value',
|
421
|
+
'schema_key' => 'schema',
|
422
|
+
'schema_version' => '1',
|
423
|
+
}, [])
|
424
|
+
])
|
354
425
|
d = create_driver(conf)
|
355
426
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
356
|
-
|
427
|
+
schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
|
428
|
+
encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
|
357
429
|
d.instance.parse(encoded) do |_time, record|
|
358
|
-
assert_equal datum
|
430
|
+
assert_equal datum, record
|
359
431
|
end
|
360
432
|
end
|
361
433
|
|
362
|
-
def
|
363
|
-
conf =
|
364
|
-
'
|
365
|
-
|
366
|
-
|
367
|
-
|
368
|
-
|
434
|
+
def test_confluent_registry_with_fallback
|
435
|
+
conf = Fluent::Config::Element.new(
|
436
|
+
'', '', {'@type' => 'avro'}, [
|
437
|
+
Fluent::Config::Element.new('confluent_registry', '', {
|
438
|
+
'url' => 'http://localhost:8081',
|
439
|
+
'subject' => 'persons-avro-value',
|
440
|
+
'schema_key' => 'schema',
|
441
|
+
}, [])
|
442
|
+
])
|
443
|
+
d = create_driver(conf)
|
444
|
+
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
445
|
+
schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
|
446
|
+
encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
|
447
|
+
d.instance.parse(encoded) do |_time, record|
|
448
|
+
assert_equal datum, record
|
369
449
|
end
|
370
450
|
end
|
371
451
|
end
|
372
452
|
|
373
453
|
private
|
374
454
|
|
375
|
-
def encode_datum(datum, string_schema)
|
455
|
+
def encode_datum(datum, string_schema, use_confluent_schema = true, schema_id = 1)
|
376
456
|
buffer = StringIO.new
|
377
457
|
encoder = Avro::IO::BinaryEncoder.new(buffer)
|
458
|
+
if use_confluent_schema
|
459
|
+
encoder.write(Fluent::Plugin::AvroParser::MAGIC_BYTE)
|
460
|
+
encoder.write([schema_id].pack("N"))
|
461
|
+
end
|
378
462
|
schema = Avro::Schema.parse(string_schema)
|
379
463
|
writer = Avro::IO::DatumWriter.new(schema)
|
380
464
|
writer.write(datum, encoder)
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: fluent-plugin-parser-avro
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Hiroshi Hatake
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2020-09-
|
11
|
+
date: 2020-09-30 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: avro
|
@@ -100,13 +100,18 @@ files:
|
|
100
100
|
- LICENSE
|
101
101
|
- README.md
|
102
102
|
- Rakefile
|
103
|
-
- fluent-plugin-avro.gemspec
|
103
|
+
- fluent-plugin-parser-avro.gemspec
|
104
|
+
- lib/fluent/plugin/confluent_avro_schema_registry.rb
|
104
105
|
- lib/fluent/plugin/parser_avro.rb
|
105
106
|
- test/data/persons-avro-value.avsc
|
106
107
|
- test/data/persons-avro-value2.avsc
|
107
108
|
- test/data/persons-avro-value3.avsc
|
108
109
|
- test/data/persons-avro-value4.avsc
|
109
110
|
- test/data/reader_user.avsc
|
111
|
+
- test/data/schema-persions-value-1.avsc
|
112
|
+
- test/data/schema-persions-value-21.avsc
|
113
|
+
- test/data/schema-persions-value-41.avsc
|
114
|
+
- test/data/schema-persions-value-42.avsc
|
110
115
|
- test/data/user.avsc
|
111
116
|
- test/data/writer_user.avsc
|
112
117
|
- test/helper.rb
|
@@ -140,6 +145,10 @@ test_files:
|
|
140
145
|
- test/data/persons-avro-value3.avsc
|
141
146
|
- test/data/persons-avro-value4.avsc
|
142
147
|
- test/data/reader_user.avsc
|
148
|
+
- test/data/schema-persions-value-1.avsc
|
149
|
+
- test/data/schema-persions-value-21.avsc
|
150
|
+
- test/data/schema-persions-value-41.avsc
|
151
|
+
- test/data/schema-persions-value-42.avsc
|
143
152
|
- test/data/user.avsc
|
144
153
|
- test/data/writer_user.avsc
|
145
154
|
- test/helper.rb
|