fluent-plugin-parser-avro 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: accbc87b882c2e1e0999a9992da794fbde6be745f240e043e77e92d9964406ac
- data.tar.gz: 817d393998e57f8ba2352f405e565b0c59de34e3c92b7cbc38653453a621fd4a
+ metadata.gz: 9633a316c1de1e4cb83b487d99d5a3814f1cde2b96d20633cfe728c584108761
+ data.tar.gz: 6438fe5e248c98540602183a5828a7dbc596fe5000c4993fd428ee9223ed3077
  SHA512:
- metadata.gz: 2dda0e434a73d6e3d1380c3e96e5cfe1595c3041bfe4afa9b99b125bcc8b19c7177139860ea108c7523f418d5fd589bc5edcd673d3025f628c0b47f2ca664c31
- data.tar.gz: 4277e90990b38c9d72e5982f817d02c39f57592af64f2b03c54af9ec708b57908f3a13bd41de3ad2a5e82794b4612b555888ace4e10ec197d728e84b89d4718e
+ metadata.gz: 8b8894c17aaa33916ef3601634d36139f2817db875eba2c5e552a0309f2c5f3ce6578526c5aad54244cd51ce36cf30bc04f64ce576fcbd4ccb89fc7726d4102d
+ data.tar.gz: a9dbc9e144b1e0f7f2b585a51c0519e6c2953d7258b5bd13fd8d6246754f9956bffcee43777378692ea4ef2d8997d8f6648653d8928e2345e70c32e48ab849a6
data/LICENSE CHANGED
@@ -187,7 +187,7 @@
  same "printed page" as the copyright notice for easier
  identification within third-party archives.
 
- Copyright 2020- Hiroshi Hatake
+ Copyright [yyyy] [name of copyright owner]
 
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
data/README.md CHANGED
@@ -32,12 +32,22 @@ $ bundle
32
32
  * **schema_file** (string) (optional): avro schema file path.
33
33
  * **schema_json** (string) (optional): avro schema definition hash.
34
34
  * **schema_url** (string) (optional): avro schema remote URL.
35
- * **schema_registery_with_subject_url** (string) (optional): avro schema registry URL.
36
35
  * **schema_url_key** (string) (optional): avro schema registry or something's response schema key.
37
36
  * **writers_schema_file** (string) (optional): avro schema file path for writers definition.
38
37
  * **writers_schema_json** (string) (optional): avro schema definition hash for writers definition.
39
38
  * **readers_schema_file** (string) (optional): avro schema file path for readers definition.
40
39
  * **readers_schema_json** (string) (optional): avro schema definition hash for readers definition.
40
+ * **use_confluent_schema** (bool) (optional): Assume that records use the Confluent Avro format, which prefixes each record with 5 bytes: a magic byte (1 byte) and a schema_id (4 bytes). This parameter specifies whether the parser skips those first 5 bytes before decoding the record.
41
+ * Default value: `true`.
42
+
43
+ ### \<confluent_registry\> section (optional) (single)
44
+
45
+ * **url** (string) (required): confluent schema registry URL.
46
+ * **subject** (string) (required): Specify schema subject.
47
+ * **schema_key** (string) (optional): Specify schema key on confluent registry REST API response.
48
+ * Default value: `schema`.
49
+ * **schema_version** (string) (optional): Specify schema version for the specified subject.
50
+ * Default value: `latest`.
41
51
 
42
52
  ### Configuration Example
43
53
 
@@ -48,7 +58,14 @@ $ bundle
48
58
  # schema_json { "namespace": "org.fluentd.parser.avro", "type": "record", "name": "User", "fields" : [{"name": "username", "type": "string"}, {"name": "age", "type": "int"}, {"name": "verified", "type": ["boolean", "null"], "default": false}]}
49
59
  # schema_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/[the latest schema version]
50
60
  # schema_key schema
51
- # schema_registery_with_subject_url http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/
61
+ # When using with confluent registry without <confluent_registry>, this parameter must be true.
62
+ # use_confluent_schema true
63
+ #<confluent_registry>
64
+ # url http://localhost:8081/
65
+ # subject your-awesome-subject
66
+ # # schema_key schema
67
+ # # schema_version 1
68
+ #</confluent_registry>
52
69
  </parse>
53
70
  ```
54
71
 
@@ -58,22 +75,44 @@ Confluent AVRO schema registry should respond with REST API.
58
75
 
59
76
  This plugin uses the following API:
60
77
 
61
- * [`GET /subjects/(string: subject)/versions`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
62
78
  * [`GET /subjects/(string: subject)/versions/(versionId: version)`](https://docs.confluent.io/current/schema-registry/develop/api.html#get--subjects-(string-%20subject)-versions)
63
79
 
64
- Users can specify a URL for retrieving the latest schemna information:
80
+ Users can specify a URL for retrieving the latest schema information with `<confluent_registry>`:
65
81
 
66
- e.g.) `http(s)://[server fqdn]:[port]/subjects/[a great user's subject]/`
82
+ e.g.)
83
+ ```
84
+ <confluent_registry>
85
+ url http://[confluent registry server ip]:[port]/
86
+ subject your-awesome-subject
87
+ # schema_key schema
88
+ # schema_version 1
89
+ </confluent_registry>
90
+ ```
67
91
 
68
92
  For example, when specifying the following configuration:
69
93
 
70
94
  ```
71
95
  <parse>
72
96
  @type avro
73
- schema_registery_with_subject_url http://localhost:8081/subjects/persons-avro-value/
97
+ <confluent_registry>
98
+ url http://localhost:8081/
99
+ subject persons-avro-value
100
+ # schema_key schema
101
+ # schema_version 1
102
+ </confluent_registry>
74
103
  ```
75
104
 
76
- Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/` to retrive the registered schema versions and then calls `GET GET http://localhost:8081/subjects/persons-avro-value/versions/<the latest schema version>`.
105
+ Then the parser plugin calls `GET http://localhost:8081/subjects/persons-avro-value/versions/latest` to retrieve the latest registered schema. When parsing fails, this plugin calls `GET http://localhost:8081/schemas/ids/<schema id>`, where the schema id is taken from the second field of the Confluent Avro record (the 4 bytes that follow the magic byte).
106
+
107
+ If you use this plugin to parse confluent schema, please specify `use_confluent_schema` as `true`.
108
+
109
+ This is because Confluent Avro records use the following structure:
110
+
111
+ MAGIC_BYTE | schema_id | record
112
+ ----------:|:---------:|:---------------
113
+ 1byte | 4bytes | record contents
114
+
115
+ When the `<confluent_registry>` section is specified in the configuration, this plugin automatically reads the first 5 bytes as a header, parses `schema_id` from them, and decodes the rest as the record.
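A minimal sketch of the header handling described above, using only the Ruby standard library. `split_confluent_frame` is a hypothetical helper, not part of the plugin; it only illustrates how the magic byte and the big-endian `schema_id` could be separated from the Avro body. Unlike the plugin's own error message, the sketch interpolates the offending byte with `#{...}`.

```ruby
require "stringio"

# Confluent framing: 1 magic byte (0x00), a 4-byte big-endian schema id,
# then the Avro-encoded record.
MAGIC_BYTE = [0].pack("C").freeze

# Hypothetical helper (an assumption for illustration, not a plugin API):
# returns [schema_id, remaining_bytes] for a framed payload.
def split_confluent_frame(data)
  io = StringIO.new(data)
  magic = io.read(1)
  unless magic == MAGIC_BYTE
    raise ArgumentError, "expected magic byte 0x00 but got #{magic.inspect}"
  end
  id_bytes = io.read(4)
  if id_bytes.nil? || id_bytes.bytesize < 4
    raise EOFError, "payload too short for a 4-byte schema id"
  end
  # "N" unpacks a 32-bit unsigned integer in network (big-endian) byte order.
  [id_bytes.unpack1("N"), io.read]
end

framed = [0, 42].pack("CN") + "avro-body".b
schema_id, body = split_confluent_frame(framed)
puts schema_id
puts body
```

The returned `schema_id` is the value that would be sent to `/schemas/ids/<schema id>` when a lookup by id is needed.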
77
116
 
78
117
  ## Copyright
79
118
 
@@ -3,7 +3,7 @@ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
 
  Gem::Specification.new do |spec|
    spec.name = "fluent-plugin-parser-avro"
-   spec.version = "0.1.0"
+   spec.version = "0.2.0"
    spec.authors = ["Hiroshi Hatake"]
    spec.email = ["cosmo0920.wp@gmail.com"]
 
@@ -0,0 +1,49 @@
+ #
+ # Copyright 2020- Hiroshi Hatake
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ # http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ require "net/http"
+ require "uri"
+
+ module Fluent
+   module Plugin
+     class ConfluentAvroSchemaRegistry
+       def initialize(registry_url)
+         @registry_url = registry_url
+       end
+
+       def subject_version(subject, schema_key, version = "latest")
+         registry_uri = URI.parse(@registry_url)
+         registry_uri_with_versions = URI.join(registry_uri, "/subjects/#{subject}/versions/#{version}")
+         response = Net::HTTP.get_response(registry_uri_with_versions)
+         if schema_key.nil?
+           response.body
+         else
+           Yajl.load(response.body)[schema_key]
+         end
+       end
+
+       def schema_with_id(schema_id, schema_key)
+         registry_uri = URI.parse(@registry_url)
+         registry_uri_with_ids = URI.join(registry_uri, "/schemas/ids/#{schema_id}")
+         response = Net::HTTP.get_response(registry_uri_with_ids)
+         if schema_key.nil?
+           response.body
+         else
+           Yajl.load(response.body)[schema_key]
+         end
+       end
+     end
+   end
+ end
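The new class builds its request URIs with `URI.join` and a path that starts with `/`. The following sketch shows what that implies for the configured `url`, under the assumption that standard `URI.join` resolution applies; `subject_version_uri` is a hypothetical helper written for this note, and it passes the base URL as a string rather than a parsed URI. Because the joined path is absolute, any path already present in the base URL is replaced, so a trailing slash on `url` makes no difference.

```ruby
require "uri"

# Hypothetical helper (not in the gem): builds the subject/version URI the
# same way the class above does, from a base URL string.
def subject_version_uri(registry_url, subject, version = "latest")
  URI.join(registry_url, "/subjects/#{subject}/versions/#{version}")
end

# With or without a trailing slash, the result is the same.
puts subject_version_uri("http://localhost:8081/", "persons-avro-value")
puts subject_version_uri("http://localhost:8081", "persons-avro-value", "1")

# A path prefix on the base URL is dropped, because the joined path is absolute.
puts subject_version_uri("http://localhost:8081/registry/", "persons-avro-value")
```

If a registry is served under a path prefix, the prefix would therefore not survive this construction; that is an observation about `URI.join`, not a documented limitation of the plugin.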
@@ -18,21 +18,30 @@ require "net/http"
18
18
  require "stringio"
19
19
  require "uri"
20
20
  require "fluent/plugin/parser"
21
+ require_relative "./confluent_avro_schema_registry"
21
22
 
22
23
  module Fluent
23
24
  module Plugin
24
25
  class AvroParser < Fluent::Plugin::Parser
25
26
  Fluent::Plugin.register_parser("avro", self)
26
27
 
28
+ MAGIC_BYTE = [0].pack("C").freeze
29
+
27
30
  config_param :schema_file, :string, :default => nil
28
31
  config_param :schema_json, :string, :default => nil
29
32
  config_param :schema_url, :string, :default => nil
30
- config_param :schema_registery_with_subject_url, :string, :default => nil
31
33
  config_param :schema_url_key, :string, :default => nil
32
34
  config_param :writers_schema_file, :string, :default => nil
33
35
  config_param :writers_schema_json, :string, :default => nil
34
36
  config_param :readers_schema_file, :string, :default => nil
35
37
  config_param :readers_schema_json, :string, :default => nil
38
+ config_param :use_confluent_schema, :bool, :default => true
39
+ config_section :confluent_registry, param_name: :avro_registry, required: false, multi: false do
40
+ config_param :url, :string
41
+ config_param :subject, :string
42
+ config_param :schema_key, :string, :default => "schema"
43
+ config_param :schema_version, :string, :default => "latest"
44
+ end
36
45
 
37
46
  def configure(conf)
38
47
  super
@@ -60,18 +69,20 @@ module Fluent
60
69
  @writers_schema = Avro::Schema.parse(@writers_raw_schema)
61
70
  @readers_schema = Avro::Schema.parse(@readers_raw_schema)
62
71
  @reader = Avro::IO::DatumReader.new(@writers_schema, @readers_schema)
72
+ elsif @avro_registry
73
+ @confluent_registry = Fluent::Plugin::ConfluentAvroSchemaRegistry.new(@avro_registry.url)
74
+ @raw_schema = @confluent_registry.subject_version(@avro_registry.subject,
75
+ @avro_registry.schema_key,
76
+ @avro_registry.schema_version)
77
+ @schema = Avro::Schema.parse(@raw_schema)
78
+ @reader = Avro::IO::DatumReader.new(@schema)
63
79
  else
64
- unless [@schema_json, @schema_file, @schema_url, @schema_registery_with_subject_url].compact.size == 1
80
+ unless [@schema_json, @schema_file, @schema_url].compact.size == 1
65
81
  raise Fluent::ConfigError, "schema_json, schema_file, or schema_url is required, but they cannot specify at the same time!"
66
82
  end
67
- if @schema_registery_with_subject_url && !@schema_registery_with_subject_url.end_with?("/")
68
- raise Fluent::ConfigError, "schema_registery_with_subject_url must contain the trailing slash('/')."
69
- end
70
83
 
71
84
  @raw_schema = if @schema_file
72
85
  File.read(@schema_file)
73
- elsif @schema_registery_with_subject_url
74
- fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
75
86
  elsif @schema_url
76
87
  fetch_schema(@schema_url, @schema_url_key)
77
88
  elsif @schema_json
@@ -91,25 +102,55 @@ module Fluent
91
102
  buffer = StringIO.new(data)
92
103
  decoder = Avro::IO::BinaryDecoder.new(buffer)
93
104
  begin
105
+ if @use_confluent_schema || @avro_registry
106
+ # When using confluent avro schema, record is formatted as follows:
107
+ #
108
+ # MAGIC_BYTE | schema_id | record
109
+ # ----------:|:---------:|:---------------
110
+ # 1byte | 4bytes | record contents
111
+ magic_byte = decoder.read(1)
112
+
113
+ if magic_byte != MAGIC_BYTE
114
+ raise "The first byte should be magic byte but got #{magic_byte.inspect}"
115
+ end
116
+ schema_id = decoder.read(4).unpack("N").first
117
+ end
94
118
  decoded_data = @reader.read(decoder)
95
119
  time, record = convert_values(parse_time(decoded_data), decoded_data)
96
120
  yield time, record
97
- rescue => e
98
- raise e if @schema_url.nil? or @schema_registery_with_subject_url.nil?
121
+ rescue EOFError, RuntimeError => e
122
+ raise e unless [@schema_url, @avro_registry].compact.size == 1
99
123
  begin
100
124
  new_raw_schema = if @schema_url
101
125
  fetch_schema(@schema_url, @schema_url_key)
102
- elsif @schema_registery_with_subject_url
103
- fetch_latest_schema(@schema_registery_with_subject_url, @schema_url_key)
126
+ elsif @avro_registry
127
+ @confluent_registry.schema_with_id(schema_id,
128
+ @avro_registry.schema_key)
104
129
  end
105
130
  new_schema = Avro::Schema.parse(new_raw_schema)
106
- is_changed = (new_raw_schena_== @raw_schema)
131
+ is_changed = (new_raw_schema != @raw_schema)
107
132
  @raw_schema = new_raw_schema
108
- @schame = new_schema
109
- rescue
133
+ @schema = new_schema
134
+ rescue EOFError, RuntimeError
110
135
  # Do nothing.
111
136
  end
112
137
  if is_changed
138
+ buffer = StringIO.new(data)
139
+ decoder = Avro::IO::BinaryDecoder.new(buffer)
140
+ if @use_confluent_schema || @avro_registry
141
+ # When using confluent avro schema, record is formatted as follows:
142
+ #
143
+ # MAGIC_BYTE | schema_id | record
144
+ # ----------:|:---------:|:---------------
145
+ # 1byte | 4bytes | record contents
146
+ magic_byte = decoder.read(1)
147
+
148
+ if magic_byte != MAGIC_BYTE
149
+ raise "The first byte should be magic byte but got #{magic_byte.inspect}"
150
+ end
151
+ schema_id = decoder.read(4).unpack("N").first
152
+ end
153
+ @reader = Avro::IO::DatumReader.new(@schema)
113
154
  decoded_data = @reader.read(decoder)
114
155
  time, record = convert_values(parse_time(decoded_data), decoded_data)
115
156
  yield time, record
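The hunk above decodes a record, and on `EOFError` or `RuntimeError` re-fetches the schema and decodes again when the schema has changed. A stripped-down sketch of that general pattern follows. It is not the plugin's exact control flow: `parse_with_refetch`, `decode`, and `fetch_schema` are hypothetical names, and the choice to re-raise when the schema is unchanged is an assumption made for the sketch.

```ruby
# Hypothetical stand-ins (assumptions, not plugin APIs):
#   decode       - callable(data, schema) that raises when the schema does not fit
#   fetch_schema - callable() returning the schema currently held by the registry
def parse_with_refetch(data, schema, decode:, fetch_schema:)
  [decode.call(data, schema), schema]
rescue EOFError, RuntimeError => e
  new_schema = fetch_schema.call
  # If the registry returns the same schema, a retry would fail the same way.
  raise e if new_schema == schema
  # Retry once from the start of the payload with the refreshed schema.
  [decode.call(data, new_schema), new_schema]
end

decode = lambda do |data, schema|
  raise "record does not match schema #{schema}" unless schema == "v2"
  data.upcase
end

record, schema = parse_with_refetch("abc", "v1", decode: decode, fetch_schema: -> { "v2" })
puts record
puts schema
```

Returning the schema alongside the record lets the caller keep the refreshed schema for subsequent records, which corresponds to the `@raw_schema` and `@schema` reassignment in the diff.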
@@ -119,24 +160,6 @@ module Fluent
119
160
  end
120
161
  end
121
162
 
122
- def fetch_schema_versions(base_uri_with_versions)
123
- versions_response = Net::HTTP.get_response(base_uri_with_versions)
124
- Yajl.load(versions_response.body)
125
- end
126
-
127
- def fetch_latest_schema(base_url, schema_key)
128
- base_uri = URI.parse(base_url)
129
- base_uri_with_versions = URI.join(base_uri, "versions/")
130
- versions = fetch_schema_versions(base_uri_with_versions)
131
- uri = URI.join(base_uri_with_versions, versions.last.to_s)
132
- response = Net::HTTP.get_response(uri)
133
- if schema_key.nil?
134
- response.body
135
- else
136
- Yajl.load(response.body)[schema_key]
137
- end
138
- end
139
-
140
163
  def fetch_schema(url, schema_key)
141
164
  uri = URI.parse(url)
142
165
  response = Net::HTTP.get_response(uri)
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\",\"default\":false}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":\"boolean\"}]}"}
@@ -0,0 +1 @@
1
+ {"schema":"{\"type\":\"record\",\"name\":\"Person\",\"namespace\":\"com.ippontech.kafkatutorials\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"},{\"name\":\"lastName\",\"type\":\"string\"},{\"name\":\"birthDate\",\"type\":\"long\"},{\"name\":\"verified\",\"type\":[\"boolean\",\"null\"],\"default\":false}]}"}
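The fixtures above show the shape of a registry response: a JSON object whose `schema` value is itself a JSON document encoded as a string. The sketch below illustrates the two-step unwrapping that the `schema_key` option implies. It uses the standard `json` library instead of Yajl, which is a substitution made for self-containment, and `extract_schema` is a hypothetical helper. The sample body is a shortened version of the fixture.

```ruby
require "json"

# Hypothetical helper mirroring the schema_key handling in the diff:
# with no key the raw body is returned, otherwise the value under that key.
def extract_schema(body, schema_key = "schema")
  return body if schema_key.nil?
  JSON.parse(body)[schema_key]
end

body = '{"schema":"{\"type\":\"record\",\"name\":\"Person\",\"fields\":[{\"name\":\"firstName\",\"type\":\"string\"}]}"}'

raw_schema = extract_schema(body)   # still a String that contains JSON
schema = JSON.parse(raw_schema)     # the schema definition as a Hash

puts schema["name"]
puts schema["fields"].map { |field| field["name"] }.inspect
```

The intermediate string is what would be handed to a schema parser; the second `JSON.parse` here only makes the nesting visible.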
@@ -76,74 +76,94 @@ class AvroParserTest < Test::Unit::TestCase
76
76
  }
77
77
  EOC
78
78
 
79
- def test_parse
79
+ data("use_confluent_schema" => true,
80
+ "plain" => false)
81
+ def test_parse(data)
82
+ config = data
80
83
  conf = {
81
- 'schema_json' => SCHEMA
84
+ 'schema_json' => SCHEMA,
85
+ 'use_confluent_schema' => config,
82
86
  }
83
87
  d = create_driver(conf)
84
88
  datum = {"username" => "foo", "age" => 42, "verified" => true}
85
- encoded = encode_datum(datum, SCHEMA)
89
+ encoded = encode_datum(datum, SCHEMA, config)
86
90
  d.instance.parse(encoded) do |_time, record|
87
91
  assert_equal datum, record
88
92
  end
89
93
 
90
94
  datum = {"username" => "baz", "age" => 34}
91
- encoded = encode_datum(datum, SCHEMA)
95
+ encoded = encode_datum(datum, SCHEMA, config)
92
96
  d.instance.parse(encoded) do |_time, record|
93
97
  assert_equal datum.merge("verified" => nil), record
94
98
  end
95
99
  end
96
100
 
97
- def test_parse_with_avro_schema
101
+ data("use_confluent_schema" => true,
102
+ "plain" => false)
103
+ def test_parse_with_avro_schema(data)
104
+ config = data
98
105
  conf = {
99
- 'schema_file' => File.join(__dir__, "..", "data", "user.avsc")
106
+ 'schema_file' => File.join(__dir__, "..", "data", "user.avsc"),
107
+ 'use_confluent_schema' => config,
100
108
  }
101
109
  d = create_driver(conf)
102
110
  datum = {"username" => "foo", "age" => 42, "verified" => true}
103
- encoded = encode_datum(datum, SCHEMA)
111
+ encoded = encode_datum(datum, SCHEMA, config)
104
112
  d.instance.parse(encoded) do |_time, record|
105
113
  assert_equal datum, record
106
114
  end
107
115
 
108
116
  datum = {"username" => "baz", "age" => 34}
109
- encoded = encode_datum(datum, SCHEMA)
117
+ encoded = encode_datum(datum, SCHEMA, config)
110
118
  d.instance.parse(encoded) do |_time, record|
111
119
  assert_equal datum.merge("verified" => nil), record
112
120
  end
113
121
  end
114
122
 
115
- def test_parse_with_readers_and_writers_schema
123
+ data("use_confluent_schema" => true,
124
+ "plain" => false)
125
+ def test_parse_with_readers_and_writers_schema(data)
126
+ config = data
116
127
  conf = {
117
128
  'writers_schema_json' => SCHEMA,
118
129
  'readers_schema_json' => READERS_SCHEMA,
130
+ 'use_confluent_schema' => config,
119
131
  }
120
132
  d = create_driver(conf)
121
133
  datum = {"username" => "foo", "age" => 42, "verified" => true}
122
- encoded = encode_datum(datum, SCHEMA)
134
+ encoded = encode_datum(datum, SCHEMA, config)
123
135
  d.instance.parse(encoded) do |_time, record|
124
136
  datum.delete("verified")
125
137
  assert_equal datum, record
126
138
  end
127
139
  end
128
140
 
129
- def test_parse_with_readers_and_writers_schema_files
141
+ data("use_confluent_schema" => true,
142
+ "plain" => false)
143
+ def test_parse_with_readers_and_writers_schema_files(data)
144
+ config = data
130
145
  conf = {
131
146
  'writers_schema_file' => File.join(__dir__, "..", "data", "writer_user.avsc"),
132
147
  'readers_schema_file' => File.join(__dir__, "..", "data", "reader_user.avsc"),
148
+ 'use_confluent_schema' => config,
133
149
  }
134
150
  d = create_driver(conf)
135
151
  datum = {"username" => "foo", "age" => 42, "verified" => true}
136
- encoded = encode_datum(datum, SCHEMA)
152
+ encoded = encode_datum(datum, SCHEMA, config)
137
153
  d.instance.parse(encoded) do |_time, record|
138
154
  datum.delete("verified")
139
155
  assert_equal datum, record
140
156
  end
141
157
  end
142
158
 
143
- def test_parse_with_complex_schema
159
+ data("use_confluent_schema" => true,
160
+ "plain" => false)
161
+ def test_parse_with_complex_schema(data)
162
+ config = data
144
163
  conf = {
145
164
  'schema_json' => COMPLEX_SCHEMA,
146
- 'time_key' => 'time'
165
+ 'time_key' => 'time',
166
+ 'use_confluent_schema' => config,
147
167
  }
148
168
  d = create_driver(conf)
149
169
  time_str = "2020-09-25 15:08:09.082113 +0900"
@@ -162,7 +182,7 @@ class AvroParserTest < Test::Unit::TestCase
162
182
  }
163
183
  }
164
184
 
165
- encoded = encode_datum(datum, COMPLEX_SCHEMA)
185
+ encoded = encode_datum(datum, COMPLEX_SCHEMA, config)
166
186
  d.instance.parse(encoded) do |time, record|
167
187
  assert_equal Time.parse(time_str).to_r, time.to_r
168
188
  datum.delete("time")
@@ -185,6 +205,22 @@ class AvroParserTest < Test::Unit::TestCase
185
205
  res.status = 200
186
206
  res.body = 'running'
187
207
  end
208
+ server.mount_proc("/schemas/ids") do |req, res|
209
+ req.path =~ /^\/schemas\/ids\/([^\/]*)$/
210
+ version = $1
211
+ @got.push({
212
+ version: version,
213
+ })
214
+ if version == "1"
215
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc"))
216
+ elsif version == "21"
217
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-21.avsc"))
218
+ elsif version == "41"
219
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-41.avsc"))
220
+ elsif version == "42"
221
+ res.body = File.read(File.join(__dir__, "..", "data", "schema-persions-value-42.avsc"))
222
+ end
223
+ end
188
224
  server.mount_proc("/subjects") do |req, res|
189
225
  req.path =~ /^\/subjects\/([^\/]*)\/([^\/]*)\/(.*)$/
190
226
  avro_registered_name = $1
@@ -204,6 +240,8 @@ class AvroParserTest < Test::Unit::TestCase
204
240
  res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value3.avsc"))
205
241
  elsif version == "4"
206
242
  res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
243
+ elsif version == "latest"
244
+ res.body = File.read(File.join(__dir__, "..", "data", "persons-avro-value4.avsc"))
207
245
  end
208
246
  end
209
247
  server.start
@@ -318,63 +356,109 @@ class AvroParserTest < Test::Unit::TestCase
318
356
  assert_equal 4, @got.size
319
357
  assert_equal 'persons-avro-value', @got[3][:registered_name]
320
358
  assert_equal '3', @got[3][:version]
359
+
360
+ assert_equal '200', client.request_get('/schemas/ids/1').code
361
+ assert_equal 5, @got.size
362
+ assert_nil @got[4][:registered_name]
363
+ assert_equal '1', @got[4][:version]
364
+
365
+ assert_equal '200', client.request_get('/schemas/ids/21').code
366
+ assert_equal 6, @got.size
367
+ assert_nil @got[5][:registered_name]
368
+ assert_equal '21', @got[5][:version]
369
+
370
+ assert_equal '200', client.request_get('/schemas/ids/41').code
371
+ assert_equal 7, @got.size
372
+ assert_nil @got[6][:registered_name]
373
+ assert_equal '41', @got[6][:version]
374
+
375
+ assert_equal '200', client.request_get('/schemas/ids/42').code
376
+ assert_equal 8, @got.size
377
+ assert_nil @got[7][:registered_name]
378
+ assert_equal '42', @got[7][:version]
321
379
  end
322
380
 
323
- def test_schema_url
381
+ data("use_confluent_schema" => true,
382
+ "plain" => false)
383
+ def test_schema_url(data)
384
+ config = data
324
385
  conf = {
325
386
  'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/1",
326
- 'schema_url_key' => 'schema'
387
+ 'schema_url_key' => 'schema',
388
+ 'use_confluent_schema' => config,
327
389
  }
328
390
  d = create_driver(conf)
329
391
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
330
- encoded = encode_datum(datum, REMOTE_SCHEMA)
392
+ encoded = encode_datum(datum, REMOTE_SCHEMA, config)
331
393
  d.instance.parse(encoded) do |_time, record|
332
394
  assert_equal datum, record
333
395
  end
334
396
  end
335
397
 
336
- def test_schema_url_with_version2
398
+ data("use_confluent_schema" => true,
399
+ "plain" => false)
400
+ def test_schema_url_with_version2(data)
401
+ config = data
337
402
  conf = {
338
403
  'schema_url' => "http://localhost:8081/subjects/persons-avro-value/versions/2",
339
- 'schema_url_key' => 'schema'
404
+ 'schema_url_key' => 'schema',
405
+ 'use_confluent_schema' => config,
340
406
  }
341
407
  d = create_driver(conf)
342
408
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
343
- encoded = encode_datum(datum, REMOTE_SCHEMA2)
409
+ encoded = encode_datum(datum, REMOTE_SCHEMA2, config)
344
410
  d.instance.parse(encoded) do |_time, record|
345
411
  assert_equal datum.merge("verified" => false), record
346
412
  end
347
413
  end
348
414
 
349
- def test_schema_registery_with_subject_url
350
- conf = {
351
- 'schema_registery_with_subject_url' => "http://localhost:8081/subjects/persons-avro-value/",
352
- 'schema_url_key' => 'schema'
353
- }
415
+ def test_confluent_registry_with_schema_version
416
+ conf = Fluent::Config::Element.new(
417
+ '', '', {'@type' => 'avro'}, [
418
+ Fluent::Config::Element.new('confluent_registry', '', {
419
+ 'url' => 'http://localhost:8081',
420
+ 'subject' => 'persons-avro-value',
421
+ 'schema_key' => 'schema',
422
+ 'schema_version' => '1',
423
+ }, [])
424
+ ])
354
425
  d = create_driver(conf)
355
426
  datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
356
- encoded = encode_datum(datum, REMOTE_SCHEMA2)
427
+ schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
428
+ encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
357
429
  d.instance.parse(encoded) do |_time, record|
358
- assert_equal datum.merge("verified" => nil), record
430
+ assert_equal datum, record
359
431
  end
360
432
  end
361
433
 
362
- def test_schema_registery_with_invalid_subject_url
363
- conf = {
364
- 'schema_registery_with_subject_url' => "http://localhost:8081/subjects/persons-avro-value",
365
- 'schema_url_key' => 'schema'
366
- }
367
- assert_raise(Fluent::ConfigError) do
368
- create_driver(conf)
434
+ def test_confluent_registry_with_fallback
435
+ conf = Fluent::Config::Element.new(
436
+ '', '', {'@type' => 'avro'}, [
437
+ Fluent::Config::Element.new('confluent_registry', '', {
438
+ 'url' => 'http://localhost:8081',
439
+ 'subject' => 'persons-avro-value',
440
+ 'schema_key' => 'schema',
441
+ }, [])
442
+ ])
443
+ d = create_driver(conf)
444
+ datum = {"firstName" => "Aleen","lastName" => "Terry","birthDate" => 159202477258}
445
+ schema = Yajl.load(File.read(File.join(__dir__, "..", "data", "schema-persions-value-1.avsc")))
446
+ encoded = encode_datum(datum, schema.fetch("schema"), true, 1)
447
+ d.instance.parse(encoded) do |_time, record|
448
+ assert_equal datum, record
369
449
  end
370
450
  end
371
451
  end
372
452
 
373
453
  private
374
454
 
375
- def encode_datum(datum, string_schema)
455
+ def encode_datum(datum, string_schema, use_confluent_schema = true, schema_id = 1)
376
456
  buffer = StringIO.new
377
457
  encoder = Avro::IO::BinaryEncoder.new(buffer)
458
+ if use_confluent_schema
459
+ encoder.write(Fluent::Plugin::AvroParser::MAGIC_BYTE)
460
+ encoder.write([schema_id].pack("N"))
461
+ end
378
462
  schema = Avro::Schema.parse(string_schema)
379
463
  writer = Avro::IO::DatumWriter.new(schema)
380
464
  writer.write(datum, encoder)
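The updated `encode_datum` helper writes the magic byte and the packed schema id ahead of the Avro body. A standalone sketch of just that framing step, without the `avro` gem, could look like the following. `frame_confluent` is a hypothetical name, and the body bytes are placeholders rather than a real encoded record.

```ruby
# Hypothetical helper: prepends the 5-byte Confluent header to an already
# encoded body. "C" packs one unsigned byte, "N" a 32-bit big-endian integer.
def frame_confluent(avro_body, schema_id)
  [0, schema_id].pack("CN") + avro_body.b
end

framed = frame_confluent("\x06foo", 1)   # "\x06foo" stands in for an encoded record
p framed.bytes.first(5)                  # => [0, 0, 0, 0, 1]
p framed.bytesize                        # => 9
```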
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: fluent-plugin-parser-avro
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - Hiroshi Hatake
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2020-09-29 00:00:00.000000000 Z
+ date: 2020-09-30 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: avro
@@ -100,13 +100,18 @@ files:
  - LICENSE
  - README.md
  - Rakefile
- - fluent-plugin-avro.gemspec
+ - fluent-plugin-parser-avro.gemspec
+ - lib/fluent/plugin/confluent_avro_schema_registry.rb
  - lib/fluent/plugin/parser_avro.rb
  - test/data/persons-avro-value.avsc
  - test/data/persons-avro-value2.avsc
  - test/data/persons-avro-value3.avsc
  - test/data/persons-avro-value4.avsc
  - test/data/reader_user.avsc
+ - test/data/schema-persions-value-1.avsc
+ - test/data/schema-persions-value-21.avsc
+ - test/data/schema-persions-value-41.avsc
+ - test/data/schema-persions-value-42.avsc
  - test/data/user.avsc
  - test/data/writer_user.avsc
  - test/helper.rb
@@ -140,6 +145,10 @@ test_files:
  - test/data/persons-avro-value3.avsc
  - test/data/persons-avro-value4.avsc
  - test/data/reader_user.avsc
+ - test/data/schema-persions-value-1.avsc
+ - test/data/schema-persions-value-21.avsc
+ - test/data/schema-persions-value-41.avsc
+ - test/data/schema-persions-value-42.avsc
  - test/data/user.avsc
  - test/data/writer_user.avsc
  - test/helper.rb