fluent-plugin-parser-avro 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/LICENSE +1 -1
- data/README.md +46 -7
- data/{fluent-plugin-avro.gemspec → fluent-plugin-parser-avro.gemspec} +1 -1
- data/lib/fluent/plugin/confluent_avro_schema_registry.rb +49 -0
- data/lib/fluent/plugin/parser_avro.rb +55 -32
- data/test/data/schema-persions-value-1.avsc +1 -0
- data/test/data/schema-persions-value-21.avsc +1 -0
- data/test/data/schema-persions-value-41.avsc +1 -0
- data/test/data/schema-persions-value-42.avsc +1 -0
- data/test/plugin/test_parser_avro.rb +120 -36
- metadata +12 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 9633a316c1de1e4cb83b487d99d5a3814f1cde2b96d20633cfe728c584108761
|
4
|
+
data.tar.gz: 6438fe5e248c98540602183a5828a7dbc596fe5000c4993fd428ee9223ed3077
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 8b8894c17aaa33916ef3601634d36139f2817db875eba2c5e552a0309f2c5f3ce6578526c5aad54244cd51ce36cf30bc04f64ce576fcbd4ccb89fc7726d4102d
|
7
|
+
data.tar.gz: a9dbc9e144b1e0f7f2b585a51c0519e6c2953d7258b5bd13fd8d6246754f9956bffcee43777378692ea4ef2d8997d8f6648653d8928e2345e70c32e48ab849a6
|
data/LICENSE
CHANGED
@@ -187,7 +187,7 @@
|
|
187
187
|
same "printed page" as the copyright notice for easier
|
188
188
|
identification within third-party archives.
|
189
189
|
|
190
|
-
Copyright
|
190
|
+
Copyright [yyyy] [name of copyright owner]
|
191
191
|
|
192
192
|
Licensed under the Apache License, Version 2.0 (the "License");
|
193
193
|
you may not use this file except in compliance with the License.
|
data/README.md
CHANGED
@@ -32,12 +32,22 @@ $ bundle
|
|
32
32
|
* **schema_file** (string) (optional): avro schema file path.
|
33
33
|
* **schema_json** (string) (optional): avro schema definition hash.
|
34
34
|
* **schema_url** (string) (optional): avro schema remote URL.
|
35
|
-
* **schema_registery_with_subject_url** (string) (optional): avro schema registry URL.
|
36
35
|
* **schema_url_key** (string) (optional): avro schema registry or something's response schema key.
|
37
36
|
* **writers_schema_file** (string) (optional): avro schema file path for writers definition.
|
38
37
|
* **writers_schema_json** (string) (optional): avro schema definition hash for writers definition.
|
39
38
|
* **readers_schema_file** (string) (optional): avro schema file path for readers definition.
|
40
39
|
* **readers_schema_json** (string) (optional): avro schema definition hash for readers definition.
|
40
|
+
* **use_confluent_schema** (bool) (optional): Assume the Confluent wire format. Confluent-framed Avro data uses the first 5 bytes for a magic byte (1 byte) and schema_id (4 bytes). This parameter specifies whether to skip those first 5 bytes when reading.
|
41
|
+
* Default value: `true`.
|
42
|
+
|
43
|
+
### \<confluent_registry\> section (optional) (single)
|
44
|
+
|
45
|
+
* **url** (string) (required): confluent schema registry URL.
|
46
|
+
* **subject** (string) (required): Specify schema subject.
|
47
|
+
* **schema_key** (string) (optional): Specify schema key on confluent registry REST API response.
|
48
|
+
* Default value: `schema`.
|
49
|
+
* **schema_version** (string) (optional): Specify schema version for the specified subject.
|
50
|
+
* Default value: `latest`.
|
41
51
|
|
42
52
|
### Configuration Example
|
43
53
|
|
@@ -48,7 +58,14 @@ $ bundle
|
|
48
58
|
# schema_json { "namespace": "org.fluentd.parser.avro", "type": "record", "name": "User", "fields" : [{"name": "username", "type": "string"}, {"name": "age", "type": "int"}, {"name": "verified", "type": ["boolean", "null"], "default": false}]}
|
49
59
|
# schema_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/[the latest schema version]
|
50
60
|
# schema_key schema
|
51
|
-
#
|
61
|
+
# When using with confluent registry without <confluent_registry>, this parameter must be true.
|
62
|
+
# use_confluent_schema true
|
63
|
+
#<confluent_registry>
|
64
|
+
# url http://localhost:8081/
|
65
|
+
# subject your-awesome-subject
|
66
|
+
# # schema_key schema
|
67
|
+
# # schema_version 1
|
68
|
+
#</confluent_registry>
|
52
69
|
</parse>
|
53
70
|
```
|
54
71
|
|
@@ -58,22 +75,44 @@ Confluent AVRO schema registry should respond with REST API.
|
|
58
75
|
|
59
76
|
This plugin uses the following API:
|
60
77
|
|
61
|
-
* [`GET /subjects/(string: subject)/versions`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
|
62
78
|
* [`GET /subjects/(string: subject)/versions/(versionId: version)`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
|
63
79
|
|
64
|
-
Users can specify a URL for retrieving the latest schemna information
|
80
|
+
Users can specify a URL for retrieving the latest schema information with `<confluent_registry>`:
|
65
81
|
|
66
|
-
e.g.)
|
82
|
+
e.g.)
|
83
|
+
```
|
84
|
+
<confluent_registry>
|
85
|
+
url http://[confluent registry server ip]:[port]/
|
86
|
+
subject your-awesome-subject
|
87
|
+
# schema_key schema
|
88
|
+
# schema_version 1
|
89
|
+
</confluent_registry>
|
90
|
+
```
|
67
91
|
|
68
92
|
For example, when specifying the following configuration:
|
69
93
|
|
70
94
|
```
|
71
95
|
<parse>
|
72
96
|
@type avro
|
73
|
-
|
97
|
+
<confluent_registry>
|
98
|
+
url http://localhost:8081/
|
99
|
+
subject persons-avro-value
|
100
|
+
# schema_key schema
|
101
|
+
# schema_version 1
|
102
|
+
</confluent_registry>
|
74
103
|
```
|
75
104
|
|
76
|
-
Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions
|
105
|
+
Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/latest` to retrieve the latest registered schema. When parsing fails, the plugin calls `GET http://localhost:8081/schemas/ids/<schema id>`, where the schema id is read from the 4 bytes that follow the magic byte at the start of the record.
|
106
|
+
|
107
|
+
If you use this plugin to parse confluent schema, please specify `use_confluent_schema` as `true`.
|
108
|
+
|
109
|
+
This is because Confluent-framed Avro records use the following structure:
|
110
|
+
|
111
|
+
MAGIC_BYTE | schema_id | record
|
112
|
+
----------:|:---------:|:---------------
|
113
|
+
1byte | 4bytes | record contents
|
114
|
+
|
115
|
+
When the `<confluent_registry>` section is specified in the configuration, the plugin automatically skips the first 5 bytes and parses `schema_id` from them.
|
77
116
|
|
78
117
|
## Copyright
|
79
118
|
|
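The 5-byte framing described in the README changes above can be sketched in plain Ruby. This is a minimal illustration, not the plugin's code: `split_confluent_frame` is a hypothetical helper name, and the payload here is an arbitrary byte string rather than real Avro data.

```ruby
require "stringio"

# Same value as the MAGIC_BYTE constant added to parser_avro.rb in this release.
MAGIC_BYTE = [0].pack("C").freeze

# Hypothetical helper (not part of the plugin): splits a Confluent-framed
# message into its schema id and the remaining record bytes.
def split_confluent_frame(data)
  io = StringIO.new(data)
  magic = io.read(1)
  unless magic == MAGIC_BYTE
    raise ArgumentError, "unexpected magic byte: #{magic.inspect}"
  end
  id_bytes = io.read(4)
  if id_bytes.nil? || id_bytes.bytesize < 4
    raise ArgumentError, "truncated schema id"
  end
  # "N" is a 32-bit unsigned big-endian integer.
  schema_id = id_bytes.unpack1("N")
  [schema_id, io.read]
end

frame = MAGIC_BYTE + [42].pack("N") + "payload".b
schema_id, payload = split_confluent_frame(frame)
```

In the plugin itself the remaining bytes are handed to an Avro datum reader; the sketch stops at the split so that it needs nothing beyond the standard library.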
data/lib/fluent/plugin/confluent_avro_schema_registry.rb
ADDED
@@ -0,0 +1,49 @@
|
|
1
|
+
#
|
2
|
+
# Copyright 2020- Hiroshi Hatake
|
3
|
+
#
|
4
|
+
# Licensed under the Apache License, Version 2.0 (the "License");
|
5
|
+
# you may not use this file except in compliance with the License.
|
6
|
+
# You may obtain a copy of the License at
|
7
|
+
#
|
8
|
+
# http://www.apache.org/licenses/LICENSE-2.0
|
9
|
+
#
|
10
|
+
# Unless required by applicable law or agreed to in writing, software
|
11
|
+
# distributed under the License is distributed on an "AS IS" BASIS,
|
12
|
+
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
13
|
+
# See the License for the specific language governing permissions and
|
14
|
+
# limitations under the License.
|
15
|
+
|
16
|
+
require "net/http"
|
17
|
+
require "uri"
|
18
|
+
|
19
|
+
module Fluent
|
20
|
+
module Plugin
|
21
|
+
class ConfluentAvroSchemaRegistry
|
22
|
+
def initialize(registry_url)
|
23
|
+
@registry_url = registry_url
|
24
|
+
end
|
25
|
+
|
26
|
+
def subject_version(subject, schema_key, version = "latest")
|
27
|
+
registry_uri = URI.parse(@registry_url)
|
28
|
+
registry_uri_with_versions = URI.join(registry_uri, "/subjects/#{subject}/versions/#{version}")
|
29
|
+
response = Net::HTTP.get_response(registry_uri_with_versions)
|
30
|
+
if schema_key.nil?
|
31
|
+
response.body
|
32
|
+
else
|
33
|
+
Yajl.load(response.body)[schema_key]
|
34
|
+
end
|
35
|
+
end
|
36
|
+
|
37
|
+
def schema_with_id(schema_id, schema_key)
|
38
|
+
registry_uri = URI.parse(@registry_url)
|
39
|
+
registry_uri_with_ids = URI.join(registry_uri, "/schemas/ids/#{schema_id}")
|
40
|
+
response = Net::HTTP.get_response(registry_uri_with_ids)
|
41
|
+
if schema_key.nil?
|
42
|
+
response.body
|
43
|
+
else
|
44
|
+
Yajl.load(response.body)[schema_key]
|
45
|
+
end
|
46
|
+
end
|
47
|
+
end
|
48
|
+
end
|
49
|
+
end
|
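The URL handling in the new registry class can be illustrated on its own. The sketch below is an assumption-laden stand-in: the helper names are hypothetical, no HTTP request is made, and the standard library's `JSON` is swapped in for `Yajl` so the example runs without extra gems.

```ruby
require "json"
require "uri"

# Hypothetical helpers mirroring how the class builds its request URIs.
def subject_version_uri(registry_url, subject, version = "latest")
  URI.join(registry_url, "/subjects/#{subject}/versions/#{version}")
end

def schema_id_uri(registry_url, schema_id)
  URI.join(registry_url, "/schemas/ids/#{schema_id}")
end

# Returns the raw body when no key is given, otherwise the value under that key.
def extract_schema(body, schema_key)
  schema_key.nil? ? body : JSON.parse(body)[schema_key]
end

latest = subject_version_uri("http://localhost:8081/", "persons-avro-value")

# Because the joined path starts with "/", any path prefix on the base URL
# is replaced rather than extended.
by_id = schema_id_uri("http://localhost:8081/registry/", 1)

body = '{"schema":"{\"type\":\"string\"}"}'
schema = extract_schema(body, "schema")
```

One consequence worth noting: since both request paths are absolute, a registry served under a path prefix would not be reachable through `URI.join` in this form.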
data/lib/fluent/plugin/parser_avro.rb
CHANGED
@@ -18,21 +18,30 @@ require "net/http"
|
|
18
18
|
require "stringio"
|
19
19
|
require "uri"
|
20
20
|
require "fluent/plugin/parser"
|
21
|
+
require_relative "./confluent_avro_schema_registry"
|
21
22
|
|
22
23
|
module Fluent
|
23
24
|
module Plugin
|
24
25
|
class AvroParser < Fluent::Plugin::Parser
|
25
26
|
Fluent::Plugin.register_parser("avro", self)
|
26
27
|
|
28
|
+
MAGIC_BYTE = [0].pack("C").freeze
|
29
|
+
|
27
30
|
config_param :schema_file, :string, :default => nil
|
28
31
|
config_param :schema_json, :string, :default => nil
|
29
32
|
config_param :schema_url, :string, :default => nil
|
30
|
-
config_param :schema_registery_with_subject_url, :string, :default => nil
|
31
33
|
config_param :schema_url_key, :string, :default => nil
|
32
34
|
config_param :writers_schema_file, :string, :default => nil
|
33
35
|
config_param :writers_schema_json, :string, :default => nil
|
34
36
|
config_param :readers_schema_file, :string, :default => nil
|
35
37
|
config_param :readers_schema_json, :string, :default => nil
|
38
|
+
config_param :use_confluent_schema, :bool, :default => true
|
39
|
+
config_section :confluent_registry, param_name: :avro_registry, required: false, multi: false do
|
40
|
+
config_param :url, :string
|
41
|
+
config_param :subject, :string
|
42
|
+
config_param :schema_key, :string, :default => "schema"
|
43
|
+
config_param :schema_version, :string, :default => "latest"
|
44
|
+
end
|
36
45
|
|
37
46
|
def configure(conf)
|
38
47
|
super
|
@@ -60,18 +69,20 @@ module Fluent
|
|
60
69
|
@writers_schema = Avro::Schema.parse(@writers_raw_schema)
|
61
70
|
@readers_schema = Avro::Schema.parse(@readers_raw_schema)
|
62
71
|
@reader = Avro::IO::DatumReader.new(@writers_schema, @readers_schema)
|
72
|
+
elsif @avro_registry
|
73
|
+
@confluent_registry = Fluent::Plugin::ConfluentAvroSchemaRegistry.new(@avro_registry.url)
|
74
|
+
@raw_schema = @confluent_registry.subject_version(@avro_registry.subject,
|
75
|
+
@avro_registry.schema_key,
|
76
|
+
@avro_registry.schema_version)
|
77
|
+
@schema = Avro::Schema.parse(@raw_schema)
|
78
|
+
@reader = Avro::IO::DatumReader.new(@schema)
|
63
79
|
else
|
64
|
-
unless [@schema_json, @schema_file, @schema_url
|
80
|
+
unless [@schema_json, @schema_file, @schema_url].compact.size == 1
|
65
81
|
raise Fluent::ConfigError, "schema_json, schema_file, or schema_url is required, but they cannot specify at the same time!"
|
66
82
|
end
|
67
|
-
if @schema_registery_with_subject_url && !@schema_registery_with_subject_url.end_with?("/")
|
68
|
-
raise Fluent::ConfigError, "schema_registery_with_subject_url must contain the trailing slash('/')."
|
69
|
-
end
|
70
83
|
|
71
84
|
@raw_schema = if @schema_file
|
72
85
|
File.read(@schema_file)
|
73
|
-
elsif @schema_registery_with_subject_url
|
74
|
-
fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
|
75
86
|
elsif @schema_url
|
76
87
|
fetch_schema(@schema_url, @schema_url_key)
|
77
88
|
elsif @schema_json
|
@@ -91,25 +102,55 @@ module Fluent
|
|
91
102
|
buffer = StringIO.new(data)
|
92
103
|
decoder = Avro::IO::BinaryDecoder.new(buffer)
|
93
104
|
begin
|
105
|
+
if @use_confluent_schema || @avro_registry
|
106
|
+
# When using confluent avro schema, record is formatted as follows:
|
107
|
+
#
|
108
|
+
# MAGIC_BYTE | schema_id | record
|
109
|
+
# ----------:|:---------:|:---------------
|
110
|
+
# 1byte | 4bytes | record contents
|
111
|
+
magic_byte = decoder.read(1)
|
112
|
+
|
113
|
+
if magic_byte != MAGIC_BYTE
|
114
|
+
raise "The first byte should be magic byte but got #{magic_byte.inspect}"
|
115
|
+
end
|
116
|
+
schema_id = decoder.read(4).unpack("N").first
|
117
|
+
end
|
94
118
|
decoded_data = @reader.read(decoder)
|
95
119
|
time, record = convert_values(parse_time(decoded_data), decoded_data)
|
96
120
|
yield time, record
|
97
|
-
rescue => e
|
98
|
-
raise e
|
121
|
+
rescue EOFError, RuntimeError => e
|
122
|
+
raise e unless [@schema_url, @avro_registry].compact.size == 1
|
99
123
|
begin
|
100
124
|
new_raw_schema = if @schema_url
|
101
125
|
fetch_schema(@schema_url, @schema_url_key)
|
102
|
-
elsif @
|
103
|
-
|
126
|
+
elsif @avro_registry
|
127
|
+
@confluent_registry.schema_with_id(schema_id,
|
128
|
+
@avro_registry.schema_key)
|
104
129
|
end
|
105
130
|
new_schema = Avro::Schema.parse(new_raw_schema)
|
106
|
-
is_changed = (
|
131
|
+
is_changed = (new_raw_schema != @raw_schema)
|
107
132
|
@raw_schema = new_raw_schema
|
108
|
-
@
|
109
|
-
rescue
|
133
|
+
@schema = new_schema
|
134
|
+
rescue EOFError, RuntimeError
|
110
135
|
# Do nothing.
|
111
136
|
end
|
112
137
|
if is_changed
|
138
|
+
buffer = StringIO.new(data)
|
139
|
+
decoder = Avro::IO::BinaryDecoder.new(buffer)
|
140
|
+
if @use_confluent_schema || @avro_registry
|
141
|
+
# When using confluent avro schema, record is formatted as follows:
|
142
|
+
#
|
143
|
+
# MAGIC_BYTE | schema_id | record
|
144
|
+
# ----------:|:---------:|:---------------
|
145
|
+
# 1byte | 4bytes | record contents
|
146
|
+
magic_byte = decoder.read(1)
|
147
|
+
|
148
|
+
if magic_byte != MAGIC_BYTE
|
149
|
+
raise "The first byte should be magic byte but got #{magic_byte.inspect}"
|
150
|
+
end
|
151
|
+
schema_id = decoder.read(4).unpack("N").first
|
152
|
+
end
|
153
|
+
@reader = Avro::IO::DatumReader.new(@schema)
|
113
154
|
decoded_data = @reader.read(decoder)
|
114
155
|
time, record = convert_values(parse_time(decoded_data), decoded_data)
|
115
156
|
yield time, record
|
@@ -119,24 +160,6 @@ module Fluent
|
|
119
160
|
end
|
120
161
|
end
|
121
162
|
|
122
|
-
def fetch_schema_versions(base_uri_with_versions)
|
123
|
-
versions_response = Net::HTTP.get_response(base_uri_with_versions)
|
124
|
-
Yajl.load(versions_response.body)
|
125
|
-
end
|
126
|
-
|
127
|
-
def fetch_latest_schema(base_url, schema_key)
|
128
|
-
base_uri = URI.parse(base_url)
|
129
|
-
base_uri_with_versions = URI.join(base_uri, "versions/")
|
130
|
-
versions = fetch_schema_versions(base_uri_with_versions)
|
131
|
-
uri = URI.join(base_uri_with_versions, versions.last.to_s)
|
132
|
-
response = Net::HTTP.get_response(uri)
|
133
|
-
if schema_key.nil?
|
134
|
-
response.body
|
135
|
-
else
|
136
|
-
Yajl.load(response.body)[schema_key]
|
137
|
-
end
|
138
|
-
end
|
139
|
-
|
140
163
|
def fetch_schema(url, schema_key)
|
141
164
|
uri = URI.parse(url)
|
142
165
|
response = Net::HTTP.get_response(uri)
|
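The fallback in `parse` (decode, and on failure refetch the schema and retry once if it changed) can be sketched independently of Avro. Everything here is hypothetical: `RetryingDecoder`, `decode:` and `fetch_schema:` are illustrative stand-ins for the datum reader and the registry lookup, and unlike the plugin this sketch lets errors from the refetch propagate.

```ruby
# Hypothetical sketch of a decode-then-refetch-and-retry flow.
class RetryingDecoder
  attr_reader :raw_schema

  # decode:       callable taking (raw_schema, data), returns a record or raises
  # fetch_schema: callable taking (schema_id), returns a raw schema string
  def initialize(raw_schema, decode:, fetch_schema:)
    @raw_schema = raw_schema
    @decode = decode
    @fetch_schema = fetch_schema
  end

  def call(data, schema_id)
    @decode.call(@raw_schema, data)
  rescue EOFError, RuntimeError => e
    new_raw_schema = @fetch_schema.call(schema_id)
    # Retrying with an unchanged schema would fail the same way, so re-raise.
    raise e if new_raw_schema.nil? || new_raw_schema == @raw_schema
    @raw_schema = new_raw_schema
    @decode.call(@raw_schema, data)
  end
end

# Toy decoder: only schema "v2" can read the data.
decode = lambda do |schema, data|
  raise EOFError, "schema mismatch" unless schema == "v2"
  data.upcase
end

decoder = RetryingDecoder.new("v1", decode: decode, fetch_schema: ->(_id) { "v2" })
result = decoder.call("record", 21)
```

Keeping the retry to a single attempt means a record that cannot be read even with the refetched schema surfaces as an error instead of looping.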
data/test/data/schema-persions-value-1.avsc
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"}]}"}
|
data/test/data/schema-persions-value-21.avsc
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\",\"default\":false}]}"}
|
data/test/data/schema-persions-value-41.avsc
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\"}]}"}
|
data/test/data/schema-persions-value-42.avsc
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":[\"boolean\",\"null\"],\"default\":false}]}"}
|
data/test/plugin/test_parser_avro.rb
CHANGED
@@ -76,74 +76,94 @@ class AvroParserTest < Test::Unit::TestCase
|
|
76
76
|
}
|
77
77
|
EOC
|
78
78
|
|
79
|
-
|
79
|
+
data("use_confluent_schema" => true,
|
80
|
+
"plain" => false)
|
81
|
+
def test_parse(data)
|
82
|
+
config = data
|
80
83
|
conf = {
|
81
|
-
'schema_json' => SCHEMA
|
84
|
+
'schema_json' => SCHEMA,
|
85
|
+
'use_confluent_schema' => config,
|
82
86
|
}
|
83
87
|
d = create_driver(conf)
|
84
88
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
85
|
-
encoded = encode_datum(datum, SCHEMA)
|
89
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
86
90
|
d.instance.parse(encoded) do |_time, record|
|
87
91
|
assert_equal datum, record
|
88
92
|
end
|
89
93
|
|
90
94
|
datum = {"username" => "baz", "age" => 34}
|
91
|
-
encoded = encode_datum(datum, SCHEMA)
|
95
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
92
96
|
d.instance.parse(encoded) do |_time, record|
|
93
97
|
assert_equal datum.merge("verified" => nil), record
|
94
98
|
end
|
95
99
|
end
|
96
100
|
|
97
|
-
|
101
|
+
data("use_confluent_schema" => true,
|
102
|
+
"plain" => false)
|
103
|
+
def test_parse_with_avro_schema(data)
|
104
|
+
config = data
|
98
105
|
conf = {
|
99
|
-
'schema_file' => File.join(__dir__, "..", "data", "user.avsc")
|
106
|
+
'schema_file' => File.join(__dir__, "..", "data", "user.avsc"),
|
107
|
+
'use_confluent_schema' => config,
|
100
108
|
}
|
101
109
|
d = create_driver(conf)
|
102
110
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
103
|
-
encoded = encode_datum(datum, SCHEMA)
|
111
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
104
112
|
d.instance.parse(encoded) do |_time, record|
|
105
113
|
assert_equal datum, record
|
106
114
|
end
|
107
115
|
|
108
116
|
datum = {"username" => "baz", "age" => 34}
|
109
|
-
encoded = encode_datum(datum, SCHEMA)
|
117
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
110
118
|
d.instance.parse(encoded) do |_time, record|
|
111
119
|
assert_equal datum.merge("verified" => nil), record
|
112
120
|
end
|
113
121
|
end
|
114
122
|
|
115
|
-
|
123
|
+
data("use_confluent_schema" => true,
|
124
|
+
"plain" => false)
|
125
|
+
def test_parse_with_readers_and_writers_schema(data)
|
126
|
+
config = data
|
116
127
|
conf = {
|
117
128
|
'writers_schema_json' => SCHEMA,
|
118
129
|
'readers_schema_json' => READERS_SCHEMA,
|
130
|
+
'use_confluent_schema' => config,
|
119
131
|
}
|
120
132
|
d = create_driver(conf)
|
121
133
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
122
|
-
encoded = encode_datum(datum, SCHEMA)
|
134
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
123
135
|
d.instance.parse(encoded) do |_time, record|
|
124
136
|
datum.delete("verified")
|
125
137
|
assert_equal datum, record
|
126
138
|
end
|
127
139
|
end
|
128
140
|
|
129
|
-
|
141
|
+
data("use_confluent_schema" => true,
|
142
|
+
"plain" => false)
|
143
|
+
def test_parse_with_readers_and_writers_schema_files(data)
|
144
|
+
config = data
|
130
145
|
conf = {
|
131
146
|
'writers_schema_file' => File.join(__dir__, "..", "data", "writer_user.avsc"),
|
132
147
|
'readers_schema_file' => File.join(__dir__, "..", "data", "reader_user.avsc"),
|
148
|
+
'use_confluent_schema' => config,
|
133
149
|
}
|
134
150
|
d = create_driver(conf)
|
135
151
|
datum = {"username" => "foo", "age" => 42, "verified" => true}
|
136
|
-
encoded = encode_datum(datum, SCHEMA)
|
152
|
+
encoded = encode_datum(datum, SCHEMA, config)
|
137
153
|
d.instance.parse(encoded) do |_time, record|
|
138
154
|
datum.delete("verified")
|
139
155
|
assert_equal datum, record
|
140
156
|
end
|
141
157
|
end
|
142
158
|
|
143
|
-
|
159
|
+
data("use_confluent_schema" => true,
|
160
|
+
"plain" => false)
|
161
|
+
def test_parse_with_complex_schema(data)
|
162
|
+
config = data
|
144
163
|
conf = {
|
145
164
|
'schema_json' => COMPLEX_SCHEMA,
|
146
|
-
'time_key' => 'time'
|
165
|
+
'time_key' => 'time',
|
166
|
+
'use_confluent_schema' => config,
|
147
167
|
}
|
148
168
|
d = create_driver(conf)
|
149
169
|
time_str = "2020-09-25 15:08:09.082113 +0900"
|
@@ -162,7 +182,7 @@ class AvroParserTest < Test::Unit::TestCase
|
|
162
182
|
}
|
163
183
|
}
|
164
184
|
|
165
|
-
encoded = encode_datum(datum, COMPLEX_SCHEMA)
|
185
|
+
encoded = encode_datum(datum, COMPLEX_SCHEMA, config)
|
166
186
|
d.instance.parse(encoded) do |time, record|
|
167
187
|
assert_equal Time.parse(time_str).to_r, time.to_r
|
168
188
|
datum.delete("time")
|
@@ -185,6 +205,22 @@ class AvroParserTest < Test::Unit::TestCase
|
|
185
205
|
res.status = 200
|
186
206
|
res.body = 'running'
|
187
207
|
end
|
208
|
+
server.mount_proc("/schemas/ids") do |req, res|
|
209
|
+
req.path =~ /^\/schemas\/ids\/([^\/]*)$/
|
210
|
+
version = $1
|
211
|
+
@got.push({
|
212
|
+
version: version,
|
213
|
+
})
|
214
|
+
if version == "1"
|
215
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc"))
|
216
|
+
elsif version == "21"
|
217
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-21.avsc"))
|
218
|
+
elsif version == "41"
|
219
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-41.avsc"))
|
220
|
+
elsif version == "42"
|
221
|
+
res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-42.avsc"))
|
222
|
+
end
|
223
|
+
end
|
188
224
|
server.mount_proc("/subjects") do |req, res|
|
189
225
|
req.path =~ /^\/subjects\/([^\/]*)\/([^\/]*)\/(.*)$/
|
190
226
|
avro_registered_name = $1
|
@@ -204,6 +240,8 @@ class AvroParserTest < Test::Unit::TestCase
|
|
204
240
|
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value3.avsc"))
|
205
241
|
elsif version == "4"
|
206
242
|
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
|
243
|
+
elsif version == "latest"
|
244
|
+
res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
|
207
245
|
end
|
208
246
|
end
|
209
247
|
server.start
|
@@ -318,63 +356,109 @@ class AvroParserTest < Test::Unit::TestCase
|
|
318
356
|
assert_equal 4, @got.size
|
319
357
|
assert_equal 'persons-avro-value', @got[3][:registered_name]
|
320
358
|
assert_equal '3', @got[3][:version]
|
359
|
+
|
360
|
+
assert_equal '200', client.request_get('/schemas/ids/1').code
|
361
|
+
assert_equal 5, @got.size
|
362
|
+
assert_nil @got[4][:registered_name]
|
363
|
+
assert_equal '1', @got[4][:version]
|
364
|
+
|
365
|
+
assert_equal '200', client.request_get('/schemas/ids/21').code
|
366
|
+
assert_equal 6, @got.size
|
367
|
+
assert_nil @got[5][:registered_name]
|
368
|
+
assert_equal '21', @got[5][:version]
|
369
|
+
|
370
|
+
assert_equal '200', client.request_get('/schemas/ids/41').code
|
371
|
+
assert_equal 7, @got.size
|
372
|
+
assert_nil @got[6][:registered_name]
|
373
|
+
assert_equal '41', @got[6][:version]
|
374
|
+
|
375
|
+
assert_equal '200', client.request_get('/schemas/ids/42').code
|
376
|
+
assert_equal 8, @got.size
|
377
|
+
assert_nil @got[7][:registered_name]
|
378
|
+
assert_equal '42', @got[7][:version]
|
321
379
|
end
|
322
380
|
|
323
|
-
|
381
|
+
data("use_confluent_schema" => true,
|
382
|
+
"plain" => false)
|
383
|
+
def test_schema_url(data)
|
384
|
+
config = data
|
324
385
|
conf = {
|
325
386
|
'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/1",
|
326
|
-
'schema_url_key' => 'schema'
|
387
|
+
'schema_url_key' => 'schema',
|
388
|
+
'use_confluent_schema' => config,
|
327
389
|
}
|
328
390
|
d = create_driver(conf)
|
329
391
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
330
|
-
encoded = encode_datum(datum, REMOTE_SCHEMA)
|
392
|
+
encoded = encode_datum(datum, REMOTE_SCHEMA, config)
|
331
393
|
d.instance.parse(encoded) do |_time, record|
|
332
394
|
assert_equal datum, record
|
333
395
|
end
|
334
396
|
end
|
335
397
|
|
336
|
-
|
398
|
+
data("use_confluent_schema" => true,
|
399
|
+
"plain" => false)
|
400
|
+
def test_schema_url_with_version2(data)
|
401
|
+
config = data
|
337
402
|
conf = {
|
338
403
|
'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/2",
|
339
|
-
'schema_url_key' => 'schema'
|
404
|
+
'schema_url_key' => 'schema',
|
405
|
+
'use_confluent_schema' => config,
|
340
406
|
}
|
341
407
|
d = create_driver(conf)
|
342
408
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
343
|
-
encoded = encode_datum(datum, REMOTE_SCHEMA2)
|
409
|
+
encoded = encode_datum(datum, REMOTE_SCHEMA2, config)
|
344
410
|
d.instance.parse(encoded) do |_time, record|
|
345
411
|
assert_equal datum.merge("verified" => false), record
|
346
412
|
end
|
347
413
|
end
|
348
414
|
|
349
|
-
def
|
350
|
-
conf =
|
351
|
-
'
|
352
|
-
|
353
|
-
|
415
|
+
def test_confluent_registry_with_schema_version
|
416
|
+
conf = Fluent::Config::Element.new(
|
417
|
+
'', '', {'@type' => 'avro'}, [
|
418
|
+
Fluent::Config::Element.new('confluent_registry', '', {
|
419
|
+
'url' => 'http://localhost:8081',
|
420
|
+
'subject' => 'persons-avro-value',
|
421
|
+
'schema_key' => 'schema',
|
422
|
+
'schema_version' => '1',
|
423
|
+
}, [])
|
424
|
+
])
|
354
425
|
d = create_driver(conf)
|
355
426
|
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
356
|
-
|
427
|
+
schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
|
428
|
+
encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
|
357
429
|
d.instance.parse(encoded) do |_time, record|
|
358
|
-
assert_equal datum
|
430
|
+
assert_equal datum, record
|
359
431
|
end
|
360
432
|
end
|
361
433
|
|
362
|
-
def
|
363
|
-
conf =
|
364
|
-
'
|
365
|
-
|
366
|
-
|
367
|
-
|
368
|
-
|
434
|
+
def test_confluent_registry_with_fallback
|
435
|
+
conf = Fluent::Config::Element.new(
|
436
|
+
'', '', {'@type' => 'avro'}, [
|
437
|
+
Fluent::Config::Element.new('confluent_registry', '', {
|
438
|
+
'url' => 'http://localhost:8081',
|
439
|
+
'subject' => 'persons-avro-value',
|
440
|
+
'schema_key' => 'schema',
|
441
|
+
}, [])
|
442
|
+
])
|
443
|
+
d = create_driver(conf)
|
444
|
+
datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
|
445
|
+
schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
|
446
|
+
encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
|
447
|
+
d.instance.parse(encoded) do |_time, record|
|
448
|
+
assert_equal datum, record
|
369
449
|
end
|
370
450
|
end
|
371
451
|
end
|
372
452
|
|
373
453
|
private
|
374
454
|
|
375
|
-
def encode_datum(datum, string_schema)
|
455
|
+
def encode_datum(datum, string_schema, use_confluent_schema = true, schema_id = 1)
|
376
456
|
buffer = StringIO.new
|
377
457
|
encoder = Avro::IO::BinaryEncoder.new(buffer)
|
458
|
+
if use_confluent_schema
|
459
|
+
encoder.write(Fluent::Plugin::AvroParser::MAGIC_BYTE)
|
460
|
+
encoder.write([schema_id].pack("N"))
|
461
|
+
end
|
378
462
|
schema = Avro::Schema.parse(string_schema)
|
379
463
|
writer = Avro::IO::DatumWriter.new(schema)
|
380
464
|
writer.write(datum, encoder)
|
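The framing that the updated `encode_datum` test helper writes before the Avro body can be shown without the Avro gem. This is a sketch under assumptions: `confluent_frame` and `CONFLUENT_MAGIC_BYTE` are hypothetical names, and the payload is an arbitrary byte string standing in for an encoded datum.

```ruby
# One zero byte, matching the value of AvroParser::MAGIC_BYTE in this release.
CONFLUENT_MAGIC_BYTE = [0].pack("C").freeze

# Hypothetical helper: prefixes an already-encoded payload with the magic byte
# and a 4-byte big-endian schema id.
def confluent_frame(schema_id, avro_payload)
  CONFLUENT_MAGIC_BYTE + [schema_id].pack("N") + avro_payload.b
end

framed = confluent_frame(21, "\x06foo")
header = framed.bytes.first(5)
body_bytes = framed.bytes.drop(5)
```

The helper in the test suite defaults `schema_id` to 1, which lines up with the `/schemas/ids/1` route that the test server mounts.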
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: fluent-plugin-parser-avro
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Hiroshi Hatake
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2020-09-
|
11
|
+
date: 2020-09-30 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: avro
|
@@ -100,13 +100,18 @@ files:
|
|
100
100
|
- LICENSE
|
101
101
|
- README.md
|
102
102
|
- Rakefile
|
103
|
-
- fluent-plugin-avro.gemspec
|
103
|
+
- fluent-plugin-parser-avro.gemspec
|
104
|
+
- lib/fluent/plugin/confluent_avro_schema_registry.rb
|
104
105
|
- lib/fluent/plugin/parser_avro.rb
|
105
106
|
- test/data/persons-avro-value.avsc
|
106
107
|
- test/data/persons-avro-value2.avsc
|
107
108
|
- test/data/persons-avro-value3.avsc
|
108
109
|
- test/data/persons-avro-value4.avsc
|
109
110
|
- test/data/reader_user.avsc
|
111
|
+
- test/data/schema-persions-value-1.avsc
|
112
|
+
- test/data/schema-persions-value-21.avsc
|
113
|
+
- test/data/schema-persions-value-41.avsc
|
114
|
+
- test/data/schema-persions-value-42.avsc
|
110
115
|
- test/data/user.avsc
|
111
116
|
- test/data/writer_user.avsc
|
112
117
|
- test/helper.rb
|
@@ -140,6 +145,10 @@ test_files:
|
|
140
145
|
- test/data/persons-avro-value3.avsc
|
141
146
|
- test/data/persons-avro-value4.avsc
|
142
147
|
- test/data/reader_user.avsc
|
148
|
+
- test/data/schema-persions-value-1.avsc
|
149
|
+
- test/data/schema-persions-value-21.avsc
|
150
|
+
- test/data/schema-persions-value-41.avsc
|
151
|
+
- test/data/schema-persions-value-42.avsc
|
143
152
|
- test/data/user.avsc
|
144
153
|
- test/data/writer_user.avsc
|
145
154
|
- test/helper.rb
|